PRELIMINARY AMENDMENT



Sir: 



Prior to Examination, please amend the above-identified application as follows:

In the Specification:

Page 1, at line 2, please insert the following paragraph:
-BACKGROUND OF THE INVENTION-

Page 2, at line 10, please insert the following paragraph:
-SUMMARY OF THE INVENTION-

Page 2, delete lines 29-32. 

Page 2, at line 33, please insert the following paragraph: 
-BRIEF DESCRIPTION OF THE DRAWINGS- 

Page 4, at line 7, insert the following paragraph: 
-DETAILED DESCRIPTION- 

In the Abstract: 

--A fundamental frequency of the audio signal is estimated, and a spectrum of the
audio signal is determined through a transform in the frequency domain of a frame of the
audio signal. Data for coding a harmonic component of the audio signal, comprising data
representative of spectral amplitudes associated with frequencies which are multiples of the
fundamental frequency, are included in a digital output stream. The spectral amplitude
associated with one of the multiple frequencies is a local maximum of the modulus of the
spectrum in the neighborhood of this multiple frequency. The data representative of spectral
amplitudes associated with the multiple frequencies are obtained by means of cepstral
coefficients calculated by transforming in the cepstral domain a compressed upper envelope
of the spectrum of the audio signal.--

In the Claims: 

Amend the following claims: 

1. (Amended) A method of coding an audio signal, comprising the steps of:
estimating a fundamental frequency of the audio signal; 

determining a spectrum of the audio signal through a transform into the frequency 
domain of a frame of the audio signal; 

calculating cepstral coefficients by transforming in the cepstral domain a compressed
upper envelope of the spectrum of the audio signal; 

obtaining data representative of spectral amplitudes associated with frequencies 
multiple of the fundamental frequency by means of the calculated cepstral coefficients; and

including data for coding a harmonic component of the audio signal, comprising said 
data representative of the spectral amplitudes associated with frequencies multiple of the 
fundamental frequency, in a digital output stream,

wherein the spectral amplitude associated with one of said frequencies multiple of the 
fundamental frequency is a local maximum of the modulus of the spectrum in the
neighborhood of said multiple frequency. 
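
By way of illustration only, and without limiting claim 1, the "wherein" clause can be sketched as a search for the largest modulus value near each multiple of the fundamental frequency. The function name, the bin-based frequency representation, and the search radius `half_width` are assumptions made for this sketch; the claim does not specify how the neighborhood is defined.

```python
def harmonic_amplitudes(modulus, f0_bins, half_width=1):
    """Return one amplitude per harmonic k*f0 (k = 1, 2, ...).

    modulus    : list of spectrum modulus values |X(i)|, one per frequency bin
    f0_bins    : estimated fundamental frequency, expressed in bins (may be fractional)
    half_width : hypothetical search radius, in bins, around each harmonic
    """
    amplitudes = []
    k = 1
    while True:
        center = int(round(k * f0_bins))       # bin closest to the k-th harmonic
        if center >= len(modulus):
            break                              # harmonic lies outside the spectrum
        lo = max(0, center - half_width)
        hi = min(len(modulus) - 1, center + half_width)
        # Largest modulus value in the neighborhood of the harmonic.
        amplitudes.append(max(modulus[lo:hi + 1]))
        k += 1
    return amplitudes


if __name__ == "__main__":
    spectrum_modulus = [0, 1, 0, 6, 5, 3, 0, 1, 2, 7, 0, 0]
    print(harmonic_amplitudes(spectrum_modulus, f0_bins=4, half_width=1))
```

In this toy spectrum the peaks sit one bin away from the exact multiples (bins 3 and 9 rather than 4 and 8), which is the situation the neighborhood search is meant to handle.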

2. (Amended) The method as claimed in claim 1, further comprising the step of:
determining the compressed upper envelope by interpolation of said spectral
amplitudes associated with the frequencies multiple of the fundamental frequency, with
application of a spectral compression function.

3. (Amended) The method as claimed in claim 2, wherein the interpolation is 
performed between points each having a frequency multiple of the fundamental frequency as
an abscissa and a spectral amplitude, compressed or uncompressed, associated with said 
multiple frequency as an ordinate. 
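
As a non-limiting sketch of claims 2 and 3, the compressed upper envelope can be built by interpolating between points whose abscissa is a multiple of the fundamental frequency and whose ordinate is the associated (compressed) amplitude. Linear interpolation, the base-10 logarithm as compression function, and holding the end values outside the harmonic range are all assumptions of this example; the claims do not fix any of them.

```python
import math


def compressed_upper_envelope(amplitudes, f0_bins, n_bins, compress=math.log10):
    """Interpolate compressed harmonic amplitudes onto bins 0 .. n_bins-1.

    amplitudes : amplitude associated with harmonic k*f0, for k = 1 .. K
    f0_bins    : fundamental frequency expressed in bins
    compress   : hypothetical spectral compression function
    """
    if not amplitudes:
        raise ValueError("at least one harmonic amplitude is required")
    # Interpolation points: abscissa = k * f0, ordinate = compressed amplitude.
    xs = [k * f0_bins for k in range(1, len(amplitudes) + 1)]
    ys = [compress(a) for a in amplitudes]
    envelope = []
    j = 0
    for i in range(n_bins):
        if i <= xs[0]:
            envelope.append(ys[0])             # below the first harmonic: hold value
        elif i >= xs[-1]:
            envelope.append(ys[-1])            # above the last harmonic: hold value
        else:
            while xs[j + 1] < i:               # advance to the segment containing i
                j += 1
            t = (i - xs[j]) / (xs[j + 1] - xs[j])
            envelope.append(ys[j] + t * (ys[j + 1] - ys[j]))
    return envelope


if __name__ == "__main__":
    print(compressed_upper_envelope([10.0, 1000.0], f0_bins=4, n_bins=10))
```

Passing `compress=lambda a: a` gives the uncompressed variant, in which case the compression would have to be applied separately.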



4. (Amended) The method as claimed in claim 1, wherein the transformation in 
the cepstral domain of the compressed upper envelope is performed according to a nonlinear 
frequency scale. 

5. (Amended) The method as claimed in claim 1, further comprising the steps of:

quantizing the cepstral coefficients to form said data representative of the spectral
amplitudes associated with the frequencies multiple of the fundamental frequency.

6. (Amended) The method as claimed in claim 5, wherein the quantization of the 
cepstral coefficients is performed on a prediction residual for each of the cepstral coefficients. 

7. (Amended) The method as claimed in claim 6, wherein the prediction residual
for a cepstral coefficient is of the form (cx[n,i] - a(i)·rcx_q[n-1,i])/[2-a(i)], where cx[n,i]
designates a current value of said cepstral coefficient, rcx_q[n-1,i] designates a previous
value of the quantized prediction residual, and a(i) designates a prediction coefficient.
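
For illustration only, the residual of claim 7 can be evaluated frame by frame for one cepstral coefficient, reading the index of the previous frame as n-1. The uniform scalar quantizer with step `step` and the zero initial state are assumptions of this sketch; the claims leave the quantizer open (claim 9, for instance, recites vector quantization).

```python
def prediction_residual(cx_now, rcx_q_prev, a):
    """(cx[n,i] - a(i) * rcx_q[n-1,i]) / (2 - a(i)) for a single coefficient i."""
    return (cx_now - a * rcx_q_prev) / (2.0 - a)


def encode_track(cx_values, a, step):
    """Quantize the residuals of one cepstral coefficient over successive frames.

    cx_values : cx[n,i] for n = 0, 1, 2, ... and a fixed order i
    a         : prediction coefficient a(i)
    step      : step of a hypothetical uniform scalar quantizer
    """
    rcx_q_prev = 0.0                 # assumed initial state before the first frame
    quantized = []
    for cx_now in cx_values:
        residual = prediction_residual(cx_now, rcx_q_prev, a)
        rcx_q_prev = step * round(residual / step)   # quantized residual, fed back
        quantized.append(rcx_q_prev)
    return quantized


if __name__ == "__main__":
    print(encode_track([1.0, 1.0], a=0.5, step=0.25))
```

Because the quantized residual of the previous frame enters the formula, the loop carries `rcx_q_prev` forward rather than the unquantized value.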

8. (Amended) The method as claimed in claim 6, further comprising the step of 
using different predictors to determine the prediction residuals for at least two of the cepstral 
coefficients. 

9. (Amended) The method as claimed in claim 5, wherein the cepstral 
coefficients are distributed into several cepstral subvectors quantized separately by a vector 
quantization performed on a prediction residual of the cepstral coefficients. 

10. (Amended) The method as claimed in claim 5, wherein the cepstral 
coefficients are normalized before quantization, by modifying the cepstral coefficient of order 
0 so that the spectral amplitude associated with a frequency multiple of the fundamental 
frequency is represented exactly by the normalized cepstral coefficients.

11. (Amended) The method as claimed in claim 5, further comprising the step of:
transforming the cepstral coefficients by liftering in the cepstral domain prior to
quantization.



12. (Amended) The method as claimed in claim 11, wherein the liftering is of the
form c_p(i) = [1 + γ2^i - γ1^i]·c(i) - (μ^i/i), where c_p(i) and c(i) designate the cepstral coefficient of
order i>0 respectively before and after liftering, γ1 and γ2 are coefficients lying between 0
and 1 and μ is a pre-emphasizing coefficient.

13. (Amended) The method as claimed in claim 12, wherein μ = (γ2 - γ1)·c(1).
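
As a hedged numerical illustration of claims 12 and 13, the sketch below reads the liftering formula as c_p(i) = [1 + γ2^i - γ1^i]·c(i) - μ^i/i with μ = (γ2 - γ1)·c(1), and simply evaluates it for i > 0. The function name is hypothetical, and leaving the order-0 coefficient unchanged is an assumption, since the formula is only stated for i > 0.

```python
def apply_liftering_formula(c, gamma1, gamma2):
    """Evaluate c_p(i) = [1 + gamma2**i - gamma1**i] * c(i) - mu**i / i for i > 0.

    c      : list of cepstral coefficients, c[i] being the coefficient of order i
    gamma1 : coefficient between 0 and 1
    gamma2 : coefficient between 0 and 1
    """
    if len(c) < 2:
        raise ValueError("c(1) is needed to compute mu")
    mu = (gamma2 - gamma1) * c[1]          # pre-emphasizing coefficient (claim 13)
    c_p = [c[0]]                           # order 0 left untouched (assumption)
    for i in range(1, len(c)):
        weight = 1.0 + gamma2 ** i - gamma1 ** i
        c_p.append(weight * c[i] - (mu ** i) / i)
    return c_p


if __name__ == "__main__":
    print(apply_liftering_formula([0.0, 1.0, 2.0], gamma1=0.5, gamma2=0.75))
```

With γ1 = γ2 the weight is 1 and μ is 0, so the coefficients are returned unchanged.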

14. (Amended) The method as claimed in claim 11, further comprising the steps of:

recalculating a value of the modulus of the spectrum of the audio signal at at least one 
frequency multiple of the fundamental frequency on the basis of the transformed and 
quantized cepstral coefficients; and 

adapting said liftering so as to minimize a discrepancy in modulus between the 
spectrum of the audio signal and at least one recalculated modulus value. 

15. (Amended) The method as claimed in claim 11, further comprising the steps of:

recalculating a value of the modulus of the spectrum of the audio signal at at least one 
frequency multiple of the fundamental frequency on the basis of the transformed and 
quantized cepstral coefficients; 

retransforming the cepstral coefficients by liftering and smoothing in the cepstral 
domain; 

calculating minimum phases of the audio signal at frequencies multiple of the 
fundamental frequency on the basis of the retransformed cepstral coefficients; and

adapting the liftering performed prior to quantization so as to minimize a deviation 
between the spectrum of the audio signal and at least one complex value having a modulus 
value recalculated for a frequency multiple of the fundamental frequency and a phase value
given by the minimum phase calculated for said multiple frequency.

16. (Amended) The method as claimed in claim 15, wherein the lifterings
performed before and after quantization are adapted jointly so as to minimize said deviation,
and wherein parameters representative of the adapted liftering performed after quantization
are included in the data for coding the harmonic component.

17. (Amended) The method as claimed in claim 14, wherein the minimized 
discrepancy for the adaptation of the liftering relates to at least one frequency multiple of the 
fundamental frequency, selected on the basis of the magnitude of the modulus of the
spectrum in absolute value. 

18. (Amended) The method as claimed in claim 14, further comprising the step of
estimating a curve of spectral masking of the audio signal by means of a psycho-acoustic 
model, and wherein the minimized discrepancy for the adaptation of the liftering relates to at 
least one frequency multiple of the fundamental frequency, selected on the basis of the
magnitude of the modulus of the spectrum in relation to the masking curve. 

19. (Amended) The method as claimed in claim 1, wherein the spectrum of the 
audio signal and the cepstral coefficients resulting from the transformation of the compressed 
upper envelope are determined for successive mutually overlapping frames of N samples of 
the audio signal, and wherein said data representative of spectral amplitudes associated with 
the frequencies multiple of the estimated fundamental frequency, obtained by means of the
cepstral coefficients calculated by transforming the compressed upper envelope, are included
in the digital output stream for just one subset of the frames.

20. (Amended) The method as claimed in claim 19, wherein, for the frames 
which do not form part of said subset, data for quantizing an error of interpolation of the 
cepstral coefficients resulting from the transformation of the compressed upper envelope are 
included in the digital output stream. 

21. (Amended) The method as claimed in claim 19, wherein, for the frames 
which do not form part of said subset, an optimal interpolator filter is determined for the 
cepstral coefficients resulting from the transformation of the compressed upper envelope and 
data representing said optimal interpolator filter are included in the digital output stream. 

22. (Amended) An audio coder, comprising: 

means for estimating a fundamental frequency of an audio signal;



means for determining a spectrum of the audio signal through a transform into the 
frequency domain of a frame of the audio signal; 

means for calculating cepstral coefficients by transforming in the cepstral domain a 
compressed upper envelope of the spectrum of the audio signal; 

means for obtaining data representative of spectral amplitudes associated with 
frequencies multiple of the fundamental frequency by means of the calculated cepstral
coefficients; and 

means for outputting a digital stream including data for coding a harmonic component 
of the audio signal, 

wherein the data for coding a harmonic component of the audio signal include said 
data representative of spectral amplitudes associated with frequencies multiple of the 
fundamental frequency, and wherein the spectral amplitude associated with one of said 
frequencies multiple of the fundamental frequency is a local maximum of the modulus of the 
spectrum in the neighborhood of said multiple frequency. 

23. (Amended) The audio coder as claimed in claim 22, further comprising:
means for determining the compressed upper envelope by interpolation of said
spectral amplitudes associated with the frequencies multiple of the fundamental frequency,
with application of a spectral compression function.

Add the following claims: 

24. (New) The audio coder as claimed in claim 23, wherein the interpolation is 
performed between points each having a frequency multiple of the fundamental frequency as 
an abscissa and a spectral amplitude, compressed or uncompressed, associated with said 
multiple frequency as an ordinate. 

25. (New) The audio coder as claimed in claim 22, wherein the transformation in 
the cepstral domain of the compressed upper envelope is performed according to a nonlinear 
frequency scale. 

26. (New) The audio coder as claimed in claim 22, further comprising: 

means for quantizing the cepstral coefficients to form said data representative of the 
spectral amplitudes associated with the frequencies multiple of the fundamental frequency.

27. (New) The audio coder as claimed in claim 26, wherein the quantization of
the cepstral coefficients is performed on a prediction residual for each of the cepstral 
coefficients. 

28. (New) The audio coder as claimed in claim 27, wherein the prediction 
residual for a cepstral coefficient is of the form (cx[n,i] - a(i)·rcx_q[n-1,i])/[2-a(i)], where
cx[n,i] designates a current value of said cepstral coefficient, rcx_q[n-1,i] designates a
previous value of the quantized prediction residual, and a(i) designates a prediction 
coefficient. 

29. (New) The audio coder as claimed in claim 27, further comprising a plurality 
of different predictors to determine the prediction residuals for at least two of the cepstral 
coefficients. 

30. (New) The audio coder as claimed in claim 26, wherein the cepstral 
coefficients are distributed into several cepstral subvectors quantized separately by a vector 
quantization performed on a prediction residual of the cepstral coefficients. 

31. (New) The audio coder as claimed in claim 26, wherein the cepstral 
coefficients are normalized before quantization, by modifying the cepstral coefficient of order 
0 so that the spectral amplitude associated with a frequency multiple of the fundamental
frequency is represented exactly by the normalized cepstral coefficients. 

32. (New) The audio coder as claimed in claim 26, further comprising: 

means for transforming the cepstral coefficients by liftering in the cepstral domain
prior to quantization. 

33. (New) The audio coder as claimed in claim 32, wherein the liftering is of the
form c_p(i) = [1 + γ2^i - γ1^i]·c(i) - (μ^i/i), where c_p(i) and c(i) designate the cepstral coefficient of
order i>0 respectively before and after liftering, γ1 and γ2 are coefficients lying between 0
and 1 and μ is a pre-emphasizing coefficient.



34. (New) The audio coder as claimed in claim 33, wherein μ = (γ2 - γ1)·c(1).



35. (New) The audio coder as claimed in claim 32, further comprising: 

means for recalculating a value of the modulus of the spectrum of the audio signal at 
at least one frequency multiple of the fundamental frequency on the basis of the transformed 
and quantized cepstral coefficients; and 

means for adapting said liftering so as to minimize a discrepancy in modulus between
the spectrum of the audio signal and at least one recalculated modulus value. 

36. (New) The audio coder as claimed in claim 32, further comprising: 

means for recalculating a value of the modulus of the spectrum of the audio signal at 
at least one frequency multiple of the fundamental frequency on the basis of the transformed 
and quantized cepstral coefficients; 

means for retransforming the cepstral coefficients by liftering and smoothing in the
cepstral domain; 

means for calculating minimum phases of the audio signal at frequencies multiple of 
the fundamental frequency on the basis of the retransformed cepstral coefficients; and 

means for adapting the liftering performed prior to quantization so as to minimize a
deviation between the spectrum of the audio signal and at least one complex value having a 
modulus value recalculated for a frequency multiple of the fundamental frequency and a 
phase value given by the minimum phase calculated for said multiple frequency. 

37. (New) The audio coder as claimed in claim 35, wherein the lifterings 
performed before and after quantization are adapted jointly so as to minimize said deviation, 
and wherein parameters representative of the adapted liftering performed after quantization
are included in the data for coding the harmonic component.

38. (New) The audio coder as claimed in claim 35, wherein the minimized 
discrepancy for the adaptation of the liftering relates to at least one frequency multiple of the
fundamental frequency, selected on the basis of the magnitude of the modulus of the 
spectrum in absolute value.

39. (New) The audio coder as claimed in claim 35, further comprising means for
estimating a curve of spectral masking of the audio signal by means of a psycho-acoustic 
model, and wherein the minimized discrepancy for the adaptation of the liftering relates to at 
least one frequency multiple of the fundamental frequency, selected on the basis of the 
magnitude of the modulus of the spectrum in relation to the masking curve. 

40. (New) The audio coder as claimed in claim 22, wherein the spectrum of the 
audio signal and the cepstral coefficients resulting from the transformation of the compressed 
upper envelope are determined for successive mutually overlapping frames of N samples of 
the audio signal, and wherein said data representative of spectral amplitudes associated with 
the frequencies multiple of the estimated fundamental frequency, obtained by means of the 
cepstral coefficients calculated by transforming the compressed upper envelope, are included 
in the digital output stream for just one subset of the frames. 

41. (New) The audio coder as claimed in claim 40, wherein, for the frames which
do not form part of said subset, data for quantizing an error of interpolation of the cepstral 
coefficients resulting from the transformation of the compressed upper envelope are included 
in the digital output stream. 

42. (New) The audio coder as claimed in claim 40, wherein, for the frames which 
do not form part of said subset, an optimal interpolator filter is determined for the cepstral 
coefficients resulting from the transformation of the compressed upper envelope and data 
representing said optimal interpolator filter are included in the digital output stream.

Remarks:

Allowance of all claims is respectfully requested. The Commissioner is authorized to 
charge any additional fees under 37 C.F.R. § 1.16 and § 1.17, or credit any overpayment to 
Deposit Account No. 20-1504 (MTR.0028US). 



Date: 




Respectfully submitted,




Dan C. Hu, Registration No. 40,025 
TROP, PRUNER & HU, P.C. 
8554 Katy Freeway, Suite 100 
Houston, Texas 77024-1805 
(713) 468-8880 [Phone] 
(713) 468-8883 [Fax]

VERSIONS WITH MARKINGS TO SHOW CHANGES
IN THE CLAIMS:

New claims 24-42 have been added. Amendments of the claims are indicated below: 
1. (Amended) A method of coding an audio signal, comprising the steps
of:

[signal (x), in which] estimating a fundamental frequency [(Fo)] of the audio signal;

[signal is estimated,] determining a spectrum of the audio signal [is determined]
through a transform into the frequency domain of a frame of the audio signal;

[signal, and] calculating cepstral coefficients by transforming in the cepstral domain a
compressed upper envelope of the spectrum of the audio signal;

[data for coding a harmonic component of the audio signal, comprising] obtaining
data representative of spectral amplitudes associated with frequencies [which are multiples]
multiple of the fundamental frequency by means of the calculated cepstral coefficients; and

including data for coding a harmonic component of the audio signal, comprising said
data representative of the spectral amplitudes associated with frequencies multiple of the
fundamental frequency, [are included] in a digital output stream,

[stream (O), characterized in that] wherein the spectral amplitude associated with one
of said frequencies [which are multiples] multiple of the fundamental frequency is a local
maximum of the modulus of the spectrum in the neighborhood of said multiple frequency.

2. (Amended) The method as claimed in claim 1, further comprising the step of:
[in which said data representative of spectral amplitudes associated with frequencies which
are multiples of the fundamental frequency (Fo) are obtained by means of cepstral
coefficients (cx_sup) calculated by transforming in the cepstral domain a compressed upper
envelope (LX_sup) of the spectrum of the audio signal]

determining the compressed upper envelope by interpolation of said spectral
amplitudes associated with the frequencies multiple of the fundamental frequency, with
application of a spectral compression function.

3. (Amended) The method as claimed in claim 2, wherein the interpolation is
performed between points each having a frequency multiple of the fundamental frequency as
an abscissa and a spectral amplitude, compressed or uncompressed, associated with said
multiple frequency as an ordinate [in which the compressed upper envelope (LX_sup) is
determined by interpolation of said spectral amplitudes associated with the frequencies which
are multiples of the fundamental frequency (Fo), with application of a spectral compression
function].

4. (Amended) The method as claimed in claim 1, wherein the transformation in
the cepstral domain of the compressed upper envelope is performed according to a nonlinear
frequency scale [3, in which the interpolation is performed between points whose abscissa is
a frequency which is a multiple of the fundamental frequency (Fo) and whose ordinate is the
spectral amplitude associated with said multiple frequency, compressed or uncompressed].

5. (Amended) The method as claimed in claim 1, further comprising the steps
of: [any one of claims 2 to 4, in which the transformation in the cepstral domain of the
compressed upper envelope (LX_sup) is performed according to a nonlinear frequency scale]

quantizing the cepstral coefficients to form said data representative of the spectral
amplitudes associated with the frequencies multiple of the fundamental frequency.

6. (Amended) The method as claimed in claim 5, wherein the quantization of the 
cepstral coefficients is performed on a prediction residual for each of the cepstral coefficients 
[any one of claims 2 to 5, in which the cepstral coefficients (cx_sup) are quantized so as to
form said data representative of the spectral amplitudes associated with the frequencies which 
are multiples of the fundamental frequency (Fo)]. 

7. (Amended) The method as claimed in claim 6, wherein the prediction residual
for a cepstral coefficient is of the form (cx[n,i] - a(i)·rcx_q[n-1,i])/[2-a(i)], where cx[n,i]
designates a current value of said cepstral coefficient, rcx_q[n-1,i] designates a previous
value of the quantized prediction residual, and a(i) designates a prediction coefficient [in
which the quantization of the cepstral coefficients (cx_sup) pertains to a prediction residual
for each of the cepstral coefficients].

8. (Amended) The method as claimed in claim 6, further comprising the step of
using different predictors to determine the prediction residuals for at least two of the cepstral
coefficients [7, in which the prediction residual for a cepstral coefficient is of the form
(cx[n,i] - a(i)·rcx_q[n-1,i])/[2-a(i)], where cx[n,i] designates a current value of said cepstral
coefficient, rcx_q[n-1,i] designates a previous value of the quantized prediction residual, and
a(i) designates a prediction coefficient].

9. (Amended) The method as claimed in claim 5, wherein the cepstral
coefficients are distributed into several cepstral subvectors quantized separately by a vector 
quantization performed on a prediction residual of the cepstral coefficients [7 or 8, in which 
different predictors are employed to determine the prediction residuals for at least two of the 
cepstral coefficients]. 

10. (Amended) The method as claimed in claim 5, wherein the cepstral
coefficients are normalized before quantization, by modifying the cepstral coefficient of order
0 so that the spectral amplitude associated with a frequency multiple of the fundamental
frequency is represented exactly by the normalized cepstral coefficients [any one of claims 6
to 9, in which the cepstral coefficients (cx_sup) are distributed into several cepstral
subvectors quantized separately by a vector quantization pertaining to a prediction residual of
the cepstral coefficients].

11. (Amended) The method as claimed in claim 5, further comprising the step of:
transforming the cepstral coefficients by liftering in the cepstral domain prior to
quantization [any one of claims 6 to 10, in which the cepstral coefficients (cx_sup) are
normalized before quantization, by modifying the cepstral coefficient of order 0 in such a
way that the spectral amplitude associated with a frequency which is a multiple of the
fundamental frequency (Fo) is represented exactly by the normalized cepstral coefficients].

12. (Amended) The method as claimed in claim 11, wherein the liftering is of the
form c_p(i) = [1 + γ2^i - γ1^i]·c(i) - (μ^i/i), where c_p(i) and c(i) designate the cepstral coefficient of
order i>0 respectively before and after liftering, γ1 and γ2 are coefficients lying between 0
and 1 and μ is a pre-emphasizing coefficient [any one of claims 6 to 11, in which the cepstral
coefficients (cx_sup) are transformed by liftering in the cepstral domain before being
quantized].



13. (Amended) The method as claimed in claim 12, wherein μ = (γ2 - γ1)·c(1) [in
which the liftering is of the form c_p(i) = [1 + γ2^i - γ1^i]·c(i) - (μ^i/i), where c_p(i) and c(i)
designate the cepstral coefficient of order i>0 respectively before and after liftering, γ1 and γ2
are coefficients lying between 0 and 1 and μ is a pre-emphasizing coefficient].

14. (Amended) The method as claimed in claim 11, further comprising the steps
of: [13, in which μ = (γ2 - γ1)·c(1)]

recalculating a value of the modulus of the spectrum of the audio signal at at least one
frequency multiple of the fundamental frequency on the basis of the transformed and
quantized cepstral coefficients; and

adapting said liftering so as to minimize a discrepancy in modulus between the 
spectrum of the audio signal and at least one recalculated modulus value. 

15. (Amended) The method as claimed in [any one of claims 12 to 14, in which a
value of the modulus of the spectrum of the audio signal at at least one frequency which is a
multiple of the fundamental frequency (Fo) is recalculated on the basis of the transformed and
quantized cepstral coefficients (cx_sup_q), and said liftering is adapted in such a way as to
minimize a discrepancy in modulus between the spectrum of the audio signal and at least one
recalculated modulus value] claim 11, further comprising the steps of:

recalculating a value of the modulus of the spectrum of the audio signal at at least one
frequency multiple of the fundamental frequency on the basis of the transformed and
quantized cepstral coefficients;

retransforming the cepstral coefficients by liftering and smoothing in the cepstral
domain;

calculating minimum phases of the audio signal at frequencies multiple of the
fundamental frequency on the basis of the retransformed cepstral coefficients; and

adapting the liftering performed prior to quantization so as to minimize a deviation
between the spectrum of the audio signal and at least one complex value having a modulus
value recalculated for a frequency multiple of the fundamental frequency and a phase value
given by the minimum phase calculated for said multiple frequency.

16. (Amended) The method as claimed in claim 15, wherein the lifterings
performed before and after quantization are adapted jointly so as to minimize said deviation,
and wherein parameters representative of the adapted liftering performed after quantization
are included in the data for coding the harmonic component [any one of claims 12 to 14, in
which a value of the modulus of the spectrum of the audio signal at at least one frequency
which is a multiple of the fundamental frequency (Fo) is recalculated on the basis of the
transformed and quantized cepstral coefficients (cx_sup_q), the cepstral coefficients are
retransformed by liftering and smoothing in the cepstral domain, minimum phases (cp(k)) of
the audio signal at frequencies which are multiples of the fundamental frequency are
calculated on the basis of the retransformed cepstral coefficients (cxl[n]), and the liftering
performed before the quantization is adapted in such a way as to minimize a deviation
between the spectrum of the audio signal and at least one complex value whose modulus has
a value recalculated for a frequency which is a multiple of the fundamental frequency and
whose phase is given by the minimum phase calculated for said multiple frequency].

17. (Amended) The method as claimed in claim 14, wherein the minimized
discrepancy for the adaptation of the liftering relates to at least one frequency multiple of the
fundamental frequency, selected on the basis of the magnitude of the modulus of the
spectrum in absolute value [16, in which the lifterings performed before and after
quantization are adapted jointly so as to minimize said discrepancy, and in which parameters
(iLif) representative of the adapted liftering performed after quantization are included in the
data for coding the harmonic component].

18. (Amended) The method as claimed in claim 14, further comprising the step of
estimating a curve of spectral masking of the audio signal by means of a psycho-acoustic
model, and wherein the minimized discrepancy for the adaptation of the liftering relates to at
least one frequency multiple of the fundamental frequency, selected on the basis of the
magnitude of the modulus of the spectrum in relation to the masking curve [any one of claims
15 to 17, in which the minimized discrepancy for the adaptation of the liftering relates to at
least one frequency which is a multiple of the fundamental frequency (Fo), selected on the
basis of the magnitude of the modulus of the spectrum in absolute value].

19. (Amended) The method as claimed in claim 1, wherein the spectrum of the
audio signal and the cepstral coefficients resulting from the transformation of the compressed
upper envelope are determined for successive mutually overlapping frames of N samples of
the audio signal, and wherein said data representative of spectral amplitudes associated with
the frequencies multiple of the estimated fundamental frequency, obtained by means of the
cepstral coefficients calculated by transforming the compressed upper envelope, are included
in the digital output stream for just one subset of the frames [any one of claims 15 to 17, in
which a curve of spectral masking of the audio signal is estimated by means of a
psycho-acoustic model, and the minimized discrepancy for the adaptation of the liftering relates to at
least one frequency which is a multiple of the fundamental frequency (Fo), selected on the
basis of the magnitude of the modulus of the spectrum in relation to the masking curve].

20. (Amended) The method as claimed in claim 19, wherein, for the frames
which do not form part of said subset, data for quantizing an error of interpolation of the
cepstral coefficients resulting from the transformation of the compressed upper envelope are
included in the digital output stream [2, in which the spectrum of the audio signal and the
cepstral coefficients (cx_sup) resulting from the transformation of the compressed upper
envelope are determined for successive frames of N samples of the audio signal which exhibit
mutual overlaps, and in which said data representative of spectral amplitudes associated with
the frequencies which are multiples of the estimated fundamental frequency (Fo), obtained by
means of the cepstral coefficients calculated by transforming the compressed upper envelope,
are included in the digital output stream (O) for just one subset of the frames].

21. (Amended) The method as claimed in claim 19, wherein, for the frames
which do not form part of said subset, an optimal interpolator filter is determined for the
cepstral coefficients resulting from the transformation of the compressed upper envelope and
data representing said optimal interpolator filter are included in the digital output stream [20,
in which, for the frames which do not form part of said subset, data (icx[n-1/2]) for
quantizing an error (ecx[n-1/2]) of interpolation of the cepstral coefficients resulting from the
transformation of the compressed upper envelope (LX_sup) are included in the digital output
stream (O)].



22. (Amended) An audio coder, comprising:
means for estimating a fundamental frequency of an audio signal;
means for determining a spectrum of the audio signal through a transform into the
frequency domain of a frame of the audio signal;

means for calculating cepstral coefficients by transforming in the cepstral domain a
compressed upper envelope of the spectrum of the audio signal;

means for obtaining data representative of spectral amplitudes associated with
frequencies multiple of the fundamental frequency by means of the calculated cepstral
coefficients; and

means for outputting a digital stream including data for coding a harmonic component
of the audio signal,

wherein the data for coding a harmonic component of the audio signal include said
data representative of spectral amplitudes associated with frequencies multiple of the
fundamental frequency, and wherein the spectral amplitude associated with one of said
frequencies multiple of the fundamental frequency is a local maximum of the modulus of the
spectrum in the neighborhood of said multiple frequency [The method as claimed in claim
20, in which, for the frames which do not form part of said subset, an optimal interpolator
filter (128) is determined for the cepstral coefficients resulting from the transformation of the
compressed upper envelope (LX_sup) and data (iP) representing said optimal interpolator
filter are included in the digital output stream (O)].

23. (Amended) The audio coder as claimed in claim 22, further comprising: 
means for determining the compressed upper envelope by interpolation of said spectral 
amplitudes associated with the frequencies multiple of the fundamental frequency, with 
application of a spectral compression function [An audio coder, comprising means for 
executing a method according to any one of claims 1 to 22]. 

AUDIO CODING WITH HARMONIC COMPONENTS 

The present invention relates to the field of the 
coding of audio signals. It applies in particular, but 
not exclusively, to the coding of speech, in narrowband 
or in broadband, in various coding bit rate ranges. 

The design of an audio codec is aimed chiefly at 
providing a good compromise between the bit rate of the 
stream transmitted by the coder and the quality of the 
audio signal which the decoder is capable of 
reconstructing from this stream. 

With this in mind, families of coders have in 
particular been developed which are based on analyzing 
the audio signal in the spectral domain: the coder 
estimates a fundamental frequency of the signal, 
representing its pitch, and the spectral analysis 
consists in determining parameters representing the 
harmonic structure of the signal at the frequencies 
which are integer multiples of this fundamental 
frequency. Modeling of the nonharmonic, or unvoiced, 
component may also be performed in the spectral domain. 
The parameters transmitted to the decoder typically 
represent the modulus of the spectrum of the voiced and 
unvoiced components. Added thereto is information 
representing either voiced/unvoiced decisions relating 
to various portions of the spectrum, or information 
regarding the probability of voicing of the signal, 
allowing the decoder to determine those portions of the 
spectrum in which it must use the voiced component or 
the unvoiced component. 

These families of coders comprise the coders of the MBE 
type (standing for "Multi-Band Excitation"), or else 
the coders of the STC type (standing for "Sinusoidal 
Transform Coder"). By way of reference, mention may be 
made of US patents 4 856 068, 4 885 790, 4 937 873, 
5 054 072, 5 081 681, 5 195 166, 5 216 747, 5 226 084, 
5 226 108, 5 247 579, 5 473 727, 5 517 511, 5 630 011, 
5 630 012, 5 649 050, 5 651 093, 5 664 051, 5 664 052, 
5 684 926, 5 701 390, 5 715 365, 5 749 065, 5 752 222, 
5 765 127, 5 774 837 and 5 890 108. 

An aim of the present invention is to make it possible 
to improve the modeling of the modulus of the spectrum 
of the signal, in a coding scheme with analysis in the 
spectral domain. 

The invention thus proposes a method of coding an audio 
signal, in which a fundamental frequency of the audio 
signal is estimated, a spectrum of the audio signal is 
determined through a transform in the frequency domain 
of a frame of the audio signal, and data for coding a 
harmonic component of the audio signal, comprising data 
representative of spectral amplitudes associated with 
frequencies which are multiples of the fundamental 
frequency, are included in a digital output stream. 

According to the invention, the spectral amplitude 
associated with one of said frequencies which are 
multiples of the fundamental frequency is a local 
maximum of the modulus of the spectrum in the 
neighborhood of said multiple frequency. 

The invention also proposes an audio coder comprising 
means for implementing the above method. 

Other features and advantages of the present invention 
will become apparent in the description below of 
non-limiting exemplary embodiments, with reference to the 
appended drawings, in which: 

figure 1 is a schematic diagram of an audio coder 
according to the invention; 

figures 2 and 3 are charts illustrating the 
formation of the audio signal frames in the coder 
of figure 1; 

figures 4 and 5 are graphs showing an exemplary 
spectrum of the audio signal and illustrating the 
extraction of the upper and lower envelopes of 
this spectrum; 

figure 6 is a schematic diagram of an example of 
quantization means usable in the coder of 
figure 1; 

figure 7 is a schematic diagram of means usable to 
extract parameters relating to the phase of the 
nonharmonic component in a variant of the coder of 
figure 1; 

figure 8 is a schematic diagram of an audio 
decoder corresponding to the coder of figure 1; 

figure 9 is a flowchart of an exemplary procedure 
for smoothing spectral coefficients and for 
extracting minimum phases implemented in the 
decoder of figure 8; 

figure 10 is a schematic diagram of modules for 
analysis and for spectral mixing of harmonic and 
nonharmonic components of the audio signals; 

figures 11 to 13 are graphs showing examples of 
nonlinear functions usable in the analysis module 
of figure 10; 

figures 14 and 15 are charts illustrating a way of 
carrying out the temporal synthesis of the signal 
frames in the decoder of figure 8; 

figures 16 and 17 are graphs showing windowing 
functions usable in the synthesis of the frames 
according to figures 14 and 15; 

figures 18 and 19 are schematic diagrams of 
interpolation means usable in a variant embodiment 
of the coder and of the decoder; 

figure 20 is a schematic diagram of interpolation 
means usable in another variant embodiment of the 
coder; 

figures 21 and 22 are charts illustrating another 
way of carrying out the temporal synthesis of the 
signal frames in the decoder of figure 8, with the 
aid of an interpolation of parameters; 

figures 23 to 25 are schematic diagrams of variant 
means of post-processing the cepstral coefficients 
representing the upper envelope of the spectrum of 
the signal in the coder of figure 1; and 

figure 26 is a partial schematic diagram of a 
decoder associated with a coder according to 
figure 25. 

The coder and decoder described hereinbelow are digital 
circuits which can, as is customary in the field of 
audio signal processing, be embodied by programming a 
digital signal processor (DSP) or an application 
specific integrated circuit (ASIC). 

The audio coder represented in figure 1 processes an 
audio input signal x which, in the nonlimiting example 
considered hereinbelow, is a speech signal. The signal 
x is available in digital form, for example at a 
sampling frequency Fe of 8 kHz. It is, for example, 
delivered by an analog/digital converter processing the 
amplified output signal from a microphone. The input 
signal x can also be formed from another version, 
analog or digital, coded or uncoded, of the speech 
signal. 

The coder comprises a module 1 which forms successive 
frames of audio signal for the various processing 
operations performed, and an output multiplexer 6 which 
delivers an output stream O containing, for each frame, 
sets of quantization parameters from which a decoder 
will be capable of synthesizing a decoded version of 
the audio signal. 

The structure of the frames is illustrated by figures 2 
and 3. Each frame 2 is composed of a number N of 
consecutive samples of the audio signal x. The 
successive frames exhibit mutual time shifts 
corresponding to M samples, so that their overlap is 
L = N-M samples of the signal. In the example 
considered, where N = 256, M = 160 and L = 96, the 
duration of the frames 2 is N/Fe = 32 ms, and a frame 
is formed every M/Fe = 20 ms. 
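
The frame structure described above can be sketched as follows. This is a minimal illustration, not the coder's implementation: the function name `split_into_frames` and the handling of the signal tail (incomplete frames are simply dropped) are assumptions made for the example.

```python
def split_into_frames(x, n=256, m=160):
    """Return overlapping frames of n samples, a new frame starting every m samples.

    Assumption: a trailing portion shorter than n samples is dropped.
    """
    frames = []
    start = 0
    while start + n <= len(x):
        frames.append(x[start:start + n])
        start += m
    return frames


FE = 8000                              # sampling frequency Fe in Hz
signal = list(range(1000))             # placeholder samples standing in for x
frames = split_into_frames(signal)
overlap = 256 - 160                    # L = N - M samples shared by successive frames
frame_duration_ms = 1000 * 256 / FE    # N/Fe
frame_period_ms = 1000 * 160 / FE      # M/Fe
```

With these values each frame lasts 32 ms, a new frame starts every 20 ms, and the last 96 samples of one frame are the first 96 samples of the next.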

In a conventional manner, the module 1 multiplies the 
samples of each frame 2 by a windowing function fA, 
preferably chosen for its good spectral properties. The 
samples x(i) of the frame being indexed from i = 0 to 
i = N-1, the analysis window fA(i) can thus be a 
Hamming window, expressed by: 

    f_A(i) = 0.54 + 0.46 \cos\left( 2\pi \, \frac{i - (N-1)/2}{N} \right)    (1)

or a Hanning window, expressed by: 

    f_A(i) = \frac{1}{2} \left( 1 + \cos\left( 2\pi \, \frac{i - (N-1)/2}{N} \right) \right)    (2)

or else a Kaiser window, expressed by: 

    f_A(i) = \frac{ I_0\left( \alpha \sqrt{ 1 - \left( 2 \, \frac{i - (N-1)/2}{N} \right)^2 } \right) }{ I_0(\alpha) }    (3)

where α is a coefficient equal, for example, to 6, and 
I0(.) designates the Bessel function of index 0. 

The coder of figure 1 carries out an analysis of the 
audio signal in the spectral domain. It comprises a 
module 3 which calculates the fast Fourier transform 
(FFT) of each signal frame. The signal frame is shaped 
before being subjected to the FFT module 3: the module 
1 appends N = 256 zero samples thereto so as to obtain 
the maximum resolution of the Fourier transform, and it 
moreover performs a circular permutation of the 
2N = 512 samples so as to compensate for the phase 
effects resulting from the analysis window. This 
modification of the frame is illustrated by figure 3. 
The frame whose fast Fourier transform is calculated on 
2N = 512 points commences with the last N/2 = 128 
weighted samples of the frame, followed by the N = 256 
zero samples, and terminates with the first N/2 = 128 
weighted samples of the frame. 
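
The shaping of the frame before the FFT can be illustrated by the sketch below. It assumes the N samples have already been multiplied by the analysis window; the function name `shape_for_fft` is hypothetical.

```python
def shape_for_fft(weighted):
    """Build the 2N-point FFT input from N already-windowed samples.

    Layout: last N/2 weighted samples, then N zeros, then first N/2 weighted samples.
    """
    n = len(weighted)
    half = n // 2
    return list(weighted[half:]) + [0.0] * n + list(weighted[:half])


# Placeholder frame whose sample values equal their indices, to make the layout visible.
shaped = shape_for_fft([float(i) for i in range(256)])
```

Placing the two halves of the windowed frame at the ends of the buffer puts the centre of the analysis window at index 0 of the transform input, which is what removes the linear phase term the window would otherwise introduce.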

The FFT module 3 obtains the spectrum of the signal for 
each frame, whose modulus and phase are respectively 
denoted |X| and φX, or |X(i)| and φX(i) for the 
frequency indices i = 0 to i = 2N-1 (by virtue of the 
symmetry of the Fourier transform and of the frames, we 
may confine ourselves to the values for 0 ≤ i < N). 



A fundamental-frequency detector 4 estimates for each 
signal frame a value of the fundamental frequency F0. 
The detector 4 can apply any known procedure for 
analyzing the speech signal of the frame to estimate 
the fundamental frequency F0, for example a procedure 
based on the autocorrelation function or the AMDF 
function, possibly preceded by a module for whitening 
by linear prediction. The estimate can also be made in 
the spectral domain or in the cepstral domain. Another 
possibility is to evaluate the time intervals between 
the consecutive breaks in the speech signal which are 
attributable to closures of the talker's glottis 
occurring over the duration of the frame. Well-known 
procedures which can be used to detect such microbreaks 
are described in the following articles: M. Basseville 
et al., "Sequential detection of abrupt changes in 
spectral characteristics of digital signals" (IEEE 
Trans. on Information Theory, 1983, Vol. IT-29, No. 5, 
pages 708-723); R. Andre-Obrecht, "A new statistical 
approach for the automatic segmentation of continuous 
speech signals" (IEEE Trans. on Acous., Speech and Sig. 
Proc., Vol. 36, No. 1, January 1988); and C. MURGIA et 
al., "An algorithm for the estimation of glottal 
closure instants using the sequential detection of 
abrupt changes in speech signals" (Signal Processing 
VII, 1994, pages 1685-1688). 

The estimated fundamental frequency F0 forms the 
subject of a quantization, for example scalar, by a 
module 5, which provides the output multiplexer 6 with 
an index iF of quantization of the fundamental 
frequency for each frame of the signal. 

The coder uses cepstral parametric modelings to 
represent an upper envelope and a lower envelope of the 
spectrum of the audio signal. The first step of the 
cepstral transformation consists in applying a spectral 
compression function to the modulus of the spectrum of 
the signal, which function may be a logarithmic or root 
function. The module 8 of the coder thus carries out, 
for each value X(i) of the spectrum of the signal 
(0 ≤ i < N), the following transformation: 

    LX(i) = \log(|X(i)|)    (4)

in the case of a logarithmic compression or 

    LX(i) = |X(i)|^{\gamma}    (5)

in the case of a root compression, γ being an exponent 
lying between 0 and 1. 
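
The two compression options can be sketched as below. The natural logarithm, the small `floor` guarding against a zero modulus, and the default exponent of 0.5 are assumptions made for the example; the text fixes none of them.

```python
import math


def compress_log(modulus, floor=1e-12):
    """Logarithmic compression, LX(i) = log|X(i)|.

    Assumption: natural logarithm, with values clamped to `floor` to avoid log(0).
    """
    return [math.log(max(v, floor)) for v in modulus]


def compress_root(modulus, gamma=0.5):
    """Root compression, LX(i) = |X(i)|**gamma with 0 < gamma < 1."""
    return [v ** gamma for v in modulus]


lx_log = compress_log([1.0, math.e])
lx_root = compress_root([4.0, 9.0])
```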

The compressed spectrum LX of the audio signal is 
processed by a module 9 which extracts spectral 
amplitudes associated with the harmonics of the signal 
corresponding to the multiples of the estimated 
fundamental frequency F0. These amplitudes are then 
interpolated by a module 10 so as to obtain a 
compressed upper envelope denoted LX_sup. 

It should be noted that the spectral compression could 
equivalently be performed after determining the 
amplitudes associated with the harmonics. It could also 
be performed after interpolation, and this would merely 
modify the form of the interpolation functions. 

The module 9 for extracting the maxima takes account of 
any variation in the fundamental frequency over the 
analysis frame, errors which the detector 4 may make, 
as well as inaccuracies related to the discrete nature 
of the frequency sampling. To do this, the search for 
the amplitudes of the spectral peaks does not consist 
simply in taking the values LX(i) corresponding to the 
indices i such that i.Fe/2N is the frequency closest to 
a harmonic of frequency k.F0 (k ≥ 1). The spectral 
amplitude retained for a harmonic of order k is a local 
maximum of the modulus of the spectrum in the 
neighborhood of the frequency k.F0 (this amplitude is 
obtained directly in compressed form when the spectral 
compression 8 is performed before the extraction of the 
maxima 9). 

Figures 4 and 5 show an exemplary form of the 
compressed spectrum LX, where it may be seen that the 
maximum amplitudes of the harmonic peaks do not 
necessarily coincide with the amplitudes corresponding 
to the integer multiples of the estimated fundamental 
frequency F0. Since the sides of the peaks are fairly 
steep, a small error in the positioning of the 
fundamental frequency F0, amplified by the harmonic 
index k, may greatly distort the estimated upper 
envelope of the spectrum and cause poor modeling of the 
formant structure of the signal. For example, directly 
taking the spectral amplitude for the frequency 3.F0 in 
the case of figures 4 and 5 would produce a sizeable 
error in the extraction of the upper envelope in the 
neighborhood of the harmonic of order k = 3, although, 
in the example drawn, this relates to a zone of 
sizeable energy. By performing the interpolation on the 
basis of the actual maximum, this kind of error in 
estimating the upper envelope is avoided. 

In the example represented in figure 4, the 
interpolation is performed between points whose 
abscissa is the frequency corresponding to the maximum 
of the amplitude of a spectral peak, and whose ordinate 
is this maximum, before or after compression. 

The interpolation performed to calculate the upper 
envelope LX_sup is a simple linear interpolation. Of 
course, some other form of interpolation could be used 
(for example polynomial or spline). 

In the preferred variant represented in figure 5, the 
interpolation is performed between points whose 
abscissa is a frequency k.F0 which is a multiple of the 
fundamental frequency (in fact the closest frequency in 
the discrete spectrum) and whose ordinate is the 
maximum amplitude, before or after compression, of the 
spectrum in the neighborhood of this multiple 
frequency. 

By comparing figures 4 and 5, it may be seen that the 
mode of extraction according to figure 5, which 
repositions the peaks on the harmonic frequencies, 
leads to better accuracy with regard to the amplitude 
of the peaks which will be attributed by the decoder to 
the frequencies which are multiples of the fundamental 
frequency. A slight frequency displacement may occur in 
the position of these peaks, this not being very 
significant perceptually and anyway not being avoided 
either in the case of figure 4. In the case of figure 
4, the anchoring points for the interpolation are one 
and the same as the vertices of the harmonic peaks. In 
the case of figure 5, these anchoring points must lie 
precisely at the frequencies which are multiples of the 
fundamental frequency, their amplitudes corresponding 
to those of the peaks. 



The search interval for the amplitude maximum 
associated with a harmonic of rank k is centered on the 
index i of the frequency of the FFT closest to k.F0, 
i.e. 

    i = \left\lfloor 2N k \frac{F_0}{F_e} + \frac{1}{2} \right\rfloor

where ⌊a⌋ designates the integer equal to or 
immediately less than the number a. The 
width of this search interval depends on the sampling 
frequency Fe, on the size 2N of the FFT and on the 
possible range of variation of the fundamental 
frequency. This width is typically of the order of 
ten frequencies with the exemplary values considered 
earlier. It may be rendered adjustable as a function of 
the value F0 of the fundamental frequency and of the 
number k of the harmonic. 
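
The extraction of the maxima and the interpolation of the upper envelope in the variant of figure 5 can be sketched as follows. This is an illustration under stated assumptions: the search interval is taken as a fixed `half_width` of 5 bins on each side (about ten frequencies in total), the envelope is held constant before the first and after the last anchoring point, and the function names are hypothetical.

```python
import math


def harmonic_anchors(lx, f0, fe=8000.0, half_width=5):
    """Return (index, local maximum) pairs, one per harmonic k*f0 (f0 > 0).

    lx holds the compressed modulus for bins 0..N-1 of a 2N-point FFT.
    The index is the FFT bin closest to k*f0; the amplitude is the maximum
    of lx in a window of +/- half_width bins around it (assumed width).
    """
    n = len(lx)
    anchors = []
    k = 1
    while True:
        centre = math.floor(2 * n * k * f0 / fe + 0.5)
        if centre >= n:
            break
        lo = max(0, centre - half_width)
        hi = min(n - 1, centre + half_width)
        anchors.append((centre, max(lx[lo:hi + 1])))
        k += 1
    return anchors


def interpolate_linear(anchors, n):
    """Linear interpolation between anchoring points over bins 0..n-1.

    Assumption: the envelope is held flat outside the first and last anchors.
    """
    env = [0.0] * n
    first_i, first_v = anchors[0]
    last_i, last_v = anchors[-1]
    for i in range(first_i):
        env[i] = first_v
    for i in range(last_i, n):
        env[i] = last_v
    for (i0, v0), (i1, v1) in zip(anchors, anchors[1:]):
        for i in range(i0, i1):
            env[i] = v0 + (v1 - v0) * (i - i0) / (i1 - i0)
    return env


# Toy compressed spectrum: a peak one bin away from the first harmonic bin.
lx = [0.0] * 256
lx[14] = 3.0
lx[26] = 5.0
anchors = harmonic_anchors(lx, f0=200.0)
lx_sup = interpolate_linear(anchors[:2], 256)
```

In the toy spectrum the first peak sits at bin 14 while the bin closest to F0 is 13; the anchoring point keeps the abscissa 13 but takes the peak amplitude, which is the repositioning described for figure 5.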

In order to improve the resolution in the low 
frequencies and hence to more faithfully represent the 
amplitudes of the harmonics in this zone, a nonlinear 
distortion of the frequency scale is carried out on the 
compressed upper envelope by a module 12 before the 
module 13 performs the inverse fast Fourier transform 
(IFFT) providing the cepstral coefficients cx_sup. 

The nonlinear distortion allows more efficient 
minimization of the modeling error. It is, for example, 
performed on a frequency scale of Mel or Bark type. 
This distortion may possibly depend on the estimated 
fundamental frequency F0. Figure 1 illustrates the case 
of the Mel scale. The relation between the frequencies 
F of the linear spectrum, expressed in hertz, and the 
frequencies F' of the Mel scale is as follows: 
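
For illustration only, the sketch below uses one commonly quoted form of the Hz-to-Mel mapping, F' = (1000 / log10 2) · log10(1 + F / 1000). Both the formula and its constants are assumptions made for this example and are not taken from the text; the function names are hypothetical.

```python
import math


def hz_to_mel(f):
    """Map a frequency in hertz to an assumed Mel scale (1000 Hz maps to 1000 Mel)."""
    return 1000.0 / math.log10(2.0) * math.log10(1.0 + f / 1000.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel under the same assumed mapping."""
    return 1000.0 * (2.0 ** (m / 1000.0) - 1.0)


mel_1k = hz_to_mel(1000.0)
round_trip = mel_to_hz(hz_to_mel(3400.0))
```

Under this mapping the scale is close to linear below about 1 kHz and logarithmic above, so resampling the envelope uniformly in F' devotes more points to the low frequencies.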



In order to limit the transmission bit rate, a 
truncation of the cepstral coefficients cx_sup is 
performed. The IFFT module 13 need only calculate a 
cepstral vector of NCS cepstral coefficients of orders 
0 to NCS-1. By way of example, NCS may be equal to 16. 

Post-filtering in the cepstral domain, referred to as 
post-liftering, is applied by a module 15 to the 
compressed upper envelope LX_sup. This post-liftering 
corresponds to a manipulation of the cepstral 
coefficients cx_sup delivered by the IFFT module 13, 
which corresponds approximately to a post-filtering of 
the harmonic part of the signal by a transfer function 
having the conventional form: 

    H(z) = (1 - \mu z^{-1}) \, \frac{A(z/\gamma_1)}{A(z/\gamma_2)}    (7)

where A(z) is the transfer function of a filter for 
linear prediction of the audio signal, γ1 and γ2 are 
coefficients lying between 0 and 1, and μ is a 
pre-emphasizing coefficient, possibly zero. The relation 
between the post-liftered coefficient of order i, 
denoted Cp(i), and the corresponding cepstral 
coefficient c(i) = cx_sup(i) delivered by the module 13 
is then: 

    C_p(0) = c(0)
    C_p(i) = (1 + \gamma_2^i - \gamma_1^i) \, c(i) - \frac{\mu^i}{i}    for i > 0    (8)

The optional pre-emphasizing coefficient may be 
controlled by setting as constraint the preserving of 
the value of the cepstral coefficient cx_sup(1) 
relating to the slope. Specifically, the value of 
c(1) = cx_sup(1) of white noise filtered by the 
pre-emphasizing filter corresponds to the pre-emphasizing 
coefficient. The latter may thus be chosen as follows: 
μ = (γ2 - γ1).c(1). 
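
The post-liftering step can be sketched as below, assuming the relation reads Cp(0) = c(0) and Cp(i) = (1 + γ2^i − γ1^i)·c(i) − μ^i/i for i > 0, with the pre-emphasizing coefficient chosen as μ = (γ2 − γ1)·c(1). The function names and the numerical values are illustrative assumptions.

```python
def post_lifter(c, gamma1, gamma2, mu=0.0):
    """Apply the assumed post-liftering relation to cepstral coefficients c."""
    cp = [c[0]]
    for i in range(1, len(c)):
        cp.append((1.0 + gamma2 ** i - gamma1 ** i) * c[i] - (mu ** i) / i)
    return cp


def slope_preserving_mu(c, gamma1, gamma2):
    """Pre-emphasizing coefficient that leaves the order-1 coefficient unchanged."""
    return (gamma2 - gamma1) * c[1]


c = [1.0, 0.5, 0.25]                       # placeholder cepstral coefficients
mu = slope_preserving_mu(c, gamma1=0.5, gamma2=0.8)
cp = post_lifter(c, gamma1=0.5, gamma2=0.8, mu=mu)
```

With this choice of μ, the order-1 coefficient comes out unchanged, since (1 + γ2 − γ1)·c(1) − μ = c(1), which is the slope-preserving constraint stated in the text.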

After the post-lifter 15, a normalizing module 16 again 
modifies the cepstral coefficients by imposing the 
constraint of exact modeling of a point of the initial 
spectrum, which is preferably the point of greatest 
energy from among the spectral maxima extracted by the 
module 9. In practice, this normalization modifies only 
the value of the coefficient Cp(0). 

The normalizing module 16 operates as follows: it 
recalculates a value of the synthesized spectrum at the 
frequency of the maximum indicated by the module 9, by 
Fourier transform of the truncated and post-liftered 
cepstral coefficients, taking into account the 
nonlinear distortion of the frequency axis; it 
determines a normalizing gain gN through the 
logarithmic difference between the value of the maximum 
as delivered by the module 9 and this value 
recalculated; and it adds the gain gN to the 
post-liftered cepstral coefficient Cp(0). This normalization 
may be viewed as being part of the post-liftering. 

The post-liftered and normalized cepstral coefficients 
form the subject of a quantization by a module 18 which 
transmits corresponding quantization indices icxs to 
the output multiplexer 6 of the coder. 

The module 18 can operate by vector quantization on the 
basis of cepstral vectors formed of post-liftered and 
normalized coefficients, here denoted cx[n] for the 
signal frame of rank n. By way of example, the cepstral 
vector cx[n] of NCS = 16 cepstral coefficients cx[n,0], 
cx[n,1], ..., cx[n,NCS-1] is distributed as four 
cepstral subvectors each containing four coefficients 
of consecutive orders. The cepstral vector cx[n] can be 
processed by the means represented in figure 6, forming 
part of the quantization module 18. These means 
implement, for each component cx[n,i], a predictor of 
the form: 

    cxp[n,i] = (1 - a(i)) . rcx[n,i] + a(i) . rcx[n-1,i]    (9)

where rcx[n] designates a residual prediction vector 
for the frame of rank n whose components are 
respectively denoted rcx[n,0], rcx[n,1], ..., 
rcx[n,NCS-1], and a(i) designates a prediction 
coefficient chosen so as to be representative of an 
assumed inter-frame correlation. After quantization of 
the residuals, this residual vector is defined by: 

    rcx[n,i] = \frac{cx[n,i] - a(i) . rcx\_q[n-1,i]}{2 - a(i)}    (10)

where rcx_q[n-1] designates the quantized residual 
vector for the frame of rank n-1, whose components are 
respectively denoted rcx_q[n-1,0], rcx_q[n-1,1], ..., 
rcx_q[n-1,NCS-1]. 

The numerator of relation (10) is obtained by a 
subtractor 20, whose output vector components are 
divided by the quantities 2-a(i) at 21. For 
quantization purposes, the residual vector rcx[n] is 
subdivided into four subvectors, corresponding to the 
subdivision into four cepstral subvectors. On the basis 
of a dictionary obtained by prior learning, the unit 22 
undertakes the vector quantization of each subvector of 
the residual vector rcx[n]. This quantization can 
consist, for each subvector srcx[n], in selecting from 
the dictionary the quantized subvector srcx_q[n] which 
minimizes the quadratic error ||srcx[n] - srcx_q[n]||^2. The 
set icxs of quantization indices icx, corresponding to 
the addresses in the dictionary or dictionaries of the 
quantized residual subvectors srcx_q[n], is provided to 
the output multiplexer 6. 
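
The dictionary search performed by the unit 22 can be sketched as a nearest-neighbour search on each subvector. The toy two-entry dictionary and the function names below are assumptions made for the example; a real dictionary would be obtained by prior learning.

```python
def nearest_codeword(subvector, codebook):
    """Return the index of the codebook entry minimizing the quadratic error."""
    best_index, best_error = 0, float("inf")
    for index, codeword in enumerate(codebook):
        error = sum((a - b) ** 2 for a, b in zip(subvector, codeword))
        if error < best_error:
            best_index, best_error = index, error
    return best_index


def quantize_residual(rcx, codebooks, sub_size=4):
    """Quantize each subvector of rcx with its own codebook.

    Returns the list of indices (one per subvector) and the quantized vector.
    """
    indices, quantized = [], []
    for s, codebook in enumerate(codebooks):
        sub = rcx[s * sub_size:(s + 1) * sub_size]
        idx = nearest_codeword(sub, codebook)
        indices.append(idx)
        quantized.extend(codebook[idx])
    return indices, quantized


# Toy dictionary with two entries, reused for the four subvectors.
toy = [[0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]]
rcx = [0.1] * 4 + [0.9] * 4 + [0.6] * 4 + [0.4] * 4
icxs, rcx_q = quantize_residual(rcx, [toy, toy, toy, toy])
```

The list `icxs` plays the role of the set of indices passed to the output multiplexer, and `rcx_q` is the quantized residual vector fed back into the predictor loop.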

The unit 22 also delivers the values of the quantized 
residual subvectors, which form the vector rcx_q[n]. 
The latter is delayed by one frame at 23, and its 
components are multiplied by the coefficients a(i) at 
24 so as to provide the vector to the negative input of 
the subtractor 20. The latter vector is, on the other 
hand, provided to an adder 25, the other input of which 
receives a vector formed by the components of the 
quantized residual rcx_q[n], respectively multiplied by 
the quantities 1-a(i) at 26. The adder 25 thus delivers 
the quantized cepstral vector cx_q[n] which will be 
recovered by the decoder. 

The prediction coefficient a(i) can be optimized 
separately for each of the cepstral coefficients. The 
quantization dictionaries may also be optimized 
separately for each of the four cepstral subvectors. 
Moreover, it is possible, in a manner known per se, to 
normalize the cepstral vectors before applying the 
prediction/quantization scheme, on the basis of the 
variance of the cepstra. 

It should be noted that the above scheme for quantizing 
the cepstral coefficients may be applied in respect of 
certain of the frames only. For example, provision may 
be made for a second mode of quantization as well as a 
process for selecting that one of the two modes which 
minimizes a least squares criterion with the cepstral 
coefficients to be quantized, and a bit indicating 
which of the two modes has been selected may be 
transmitted with the frame quantization indices. 

The quantized cepstral coefficients cx_sup_q = cx_q[n] 
provided by the adder 25 are addressed to a module 28 
which recalculates the spectral amplitudes associated 
with one or more of the harmonics of the fundamental 
frequency F0 (figure 1). These spectral amplitudes are, 
for example, calculated in compressed form, by applying 
the Fourier transform to the quantized cepstral 
coefficients while taking account of the nonlinear 
distortion of the frequency scale used in the cepstral 
transformation. The amplitudes thus recalculated are 
provided to an adaptation module 29 which compares them 
with amplitudes of maxima determined by the extraction 
module 9. 

The adaptation module 29 controls the post-lifter 15 in 
such a way as to minimize a discrepancy in modulus 
between the spectrum of the audio signal and the 
corresponding modulus values calculated at 28. This 
discrepancy in modulus can be expressed by a sum of 
absolute values of differences of amplitudes, 
compressed or otherwise, corresponding to one or more 
of the harmonic frequencies. This sum can be weighted 
as a function of the spectral amplitudes associated 
with these frequencies. 

Optimally, the discrepancy in modulus taken into 
account in the adaptation of the post-liftering would 
take account of all the harmonics of the spectrum. 
However, in order to reduce the complexity of the 
optimization, the module 28 can resynthesize the 
spectral amplitudes for just one or more frequencies 
which are multiples of the fundamental frequency F0 and 
which are selected on the basis of the magnitude of the 
modulus of the spectrum in absolute value. The 
adaptation module 29 can, for example, consider the 
three most intense spectral peaks in the calculation of 
the discrepancy in modulus to be minimized. 
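
The discrepancy in modulus restricted to the most intense peaks can be sketched as follows. The function name is hypothetical, and weighting each term by the original amplitude itself is only one possible reading of a weighting "as a function of the spectral amplitudes".

```python
def modulus_discrepancy(original, resynth, n_peaks=3, weighted=False):
    """Sum of absolute amplitude differences over the n_peaks strongest harmonics.

    original[k] and resynth[k] are the amplitudes of harmonic k as extracted
    from the spectrum and as recalculated from the quantized cepstrum.
    Assumption: when weighted is True, each term is weighted by original[k].
    """
    strongest = sorted(range(len(original)),
                       key=lambda k: original[k], reverse=True)[:n_peaks]
    total = 0.0
    for k in strongest:
        weight = original[k] if weighted else 1.0
        total += weight * abs(original[k] - resynth[k])
    return total


original = [1.0, 5.0, 3.0, 4.0, 2.0]      # placeholder harmonic amplitudes
resynth = [1.0, 4.5, 3.5, 4.0, 0.0]       # placeholder recalculated amplitudes
plain = modulus_discrepancy(original, resynth)
heavy = modulus_discrepancy(original, resynth, weighted=True)
```

Only the three strongest harmonics contribute here, so the large error on the weakest retained harmonic in the example (2.0 versus 0.0) does not influence the adaptation.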

In another embodiment, the adaptation module 29 
estimates a curve of spectral masking of the audio 
signal by means of a psycho-acoustic model, and the 
frequencies taken into account in the calculation of 
the discrepancy in modulus to be minimized are selected 
on the basis of the magnitude of the modulus of the 
spectrum in relation to the masking curve (it is, for 
example, possible to take the three frequencies for 
which the modulus of the spectrum most exceeds the 
masking curve). Various conventional methods can be 
used to calculate the masking curve from the audio 
signal. It is, for example, possible to use that 
developed by J.D. Johnston ("Transform Coding of Audio 
Signals Using Perceptual Noise Criteria", IEEE Journal 
on Selected Areas in Communications, Vol. 6, No. 2, 
February 1988). 

To carry out the adaptation of the post-liftering, the 
module 29 can use a filter identification model. A 
simpler method consists in predefining a collection of 
sets of post-liftering parameters, that is to say a 
collection of pairs γ1, γ2 in the case of post-liftering 
according to relations (8), in performing the 
operations incumbent on the modules 15, 16, 18 and 28 
for each of these sets of parameters, and in retaining 
that of the sets of parameters which leads to the 
minimum discrepancy in modulus between the spectrum of 
the signal and the recalculated values. The 
quantization indices provided by the module 18 are then 
those which relate to the best set of parameters. 

By a process similar to that for extracting the 
coefficients cx_sup representing the compressed upper 
envelope LX_sup of the spectrum of the signal, the 
coder determines the coefficients cx_inf representing a 
compressed lower envelope LX_inf. A module 30 extracts 
from the compressed spectrum LX spectral amplitudes 
associated with frequencies situated in zones of the 
spectrum which are intermediate with respect to the 
frequencies which are multiples of the estimated 
fundamental frequency F0. 

In the example illustrated by figures 4 and 5, each 
amplitude associated with a frequency situated in a 
zone intermediate between two successive harmonics k.F0 
and (k+1).F0 corresponds simply to the modulus of the 
spectrum for the frequency (k+1/2).F0 situated in the 
middle of the interval separating the two harmonics. In 
another embodiment, this amplitude could be an average 
of the modulus of the spectrum over a small span 
surrounding this frequency (k+1/2).F0. 

A module 31 carries out an interpolation, for example 
linear, of the spectral amplitudes associated with the 
frequencies situated in the intermediate zones so as to 
obtain the compressed lower envelope LX_inf. 
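
The extraction of the lower envelope can be sketched in the same spirit as for the upper envelope. The sketch assumes that k starts at 1, that the bin closest to (k+1/2).F0 is used, and that the envelope is held constant outside the first and last points; the function names are hypothetical.

```python
import math


def valley_points(lx, f0, fe=8000.0):
    """Return (index, amplitude) pairs at the bins closest to (k + 1/2) * f0.

    lx holds the compressed modulus for bins 0..N-1 of a 2N-point FFT.
    Assumption: k starts at 1 and f0 > 0.
    """
    n = len(lx)
    points = []
    k = 1
    while True:
        index = math.floor(2 * n * (k + 0.5) * f0 / fe + 0.5)
        if index >= n:
            break
        points.append((index, lx[index]))
        k += 1
    return points


def interpolate_linear(points, n):
    """Linear interpolation between points, held flat outside them (assumption)."""
    env = [0.0] * n
    for i in range(points[0][0]):
        env[i] = points[0][1]
    for i in range(points[-1][0], n):
        env[i] = points[-1][1]
    for (i0, v0), (i1, v1) in zip(points, points[1:]):
        for i in range(i0, i1):
            env[i] = v0 + (v1 - v0) * (i - i0) / (i1 - i0)
    return env


# Toy compressed spectrum whose value equals the bin index.
lx = [float(i) for i in range(256)]
points = valley_points(lx, f0=200.0)
lx_inf = interpolate_linear(points, 256)
```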

The cepstral transformation applied to this compressed 
lower envelope LX_inf is performed according to a 
frequency scale resulting from a nonlinear distortion 
applied by a module 32. The IFFT module 33 calculates a 
cepstral vector of NCI cepstral coefficients cx_inf of 
orders 0 to NCI-1 representing the lower envelope. NCI 
is a number which may be substantially smaller than 
NCS, for example NCI = 4. 

The nonlinear transformation of the frequency scale for 
the cepstral transformation of the lower envelope can 
be carried out to a scale which is finer at the high 
frequencies than at the low frequencies, thereby 
advantageously allowing good modeling of the unvoiced 
components of the signal at the high frequencies. 
However, to ensure homogeneity of representation 
between the upper envelope and the lower envelope, the 
same scale will preferably be adopted in the module 32 
as in the module 12 (Mel in the example considered). 

The cepstral coefficients cx_inf representing the 
compressed lower envelope are quantized by a module 34, 
which may operate in the same manner as the module 18 
for quantizing the cepstral coefficients representing 
the compressed upper envelope. In the case considered, 
where we restricted ourselves to NCI = 4 cepstral 
coefficients for the lower envelope, the vector thus 
formed is subjected to a prediction residual vector 
quantization performed by means identical to those 
represented in figure 6 but without subdivision into 
subvectors. The quantization index icx = icxi 
determined by the vector quantizer 22 for each frame 
relating to the coefficients cx_inf is provided to the 
output multiplexer 6 of the coder. 

The coder represented in figure 1 does not comprise any
particular device for coding the phases of the spectrum
at the harmonics of the audio signal.

On the other hand, it comprises means 36-40 for coding
time information related to the phase of the
nonharmonic component represented by the lower
envelope.

A spectral decompression module 36 and an IFFT module
37 form a temporal estimate of the frame of the
nonharmonic component. The module 36 applies a
decompression function which is the reciprocal of the
compression function applied by the module 8 (that is
to say an exponential or a 1/γ power function) to the
compressed lower envelope LX_inf produced by the
interpolation module 31. This provides the modulus of
the estimated frame of the nonharmonic component, whose
phase is taken equal to the phase φX of the spectrum of the
signal X over the frame. The inverse Fourier transform
performed by the module 37 provides the estimated frame
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of the nonharmonic component. 

The module 38 subdivides this estimated frame of the 
nonharmonic component into several time segments. The 
frame delivered by the module 37 being made up of 
2N = 512 weighted samples, as illustrated by figure 3, 
the module 38 considers only the first N/2 = 128 
samples and the last N/2 = 128 samples, and subdivides 
them, for example, into eight segments of 32 
consecutive samples each representing 4 ms of signal. 

For each segment, the module 38 calculates the energy
equal to the sum of the squares of the samples, and
forms a vector E1 of eight positive real
components equal to the eight calculated energies. The
largest of these eight energies, denoted EM, is also
determined so as to be provided, with the vector E1, to
a normalizing module 39. The latter divides each
component of the vector E1 by EM, so that the
normalized vector Emix is formed of eight components 
lying between 0 and 1. It is this normalized vector 
Emix, or weighting vector, which is subjected to the 
quantization by the module 40. The latter can carry out 
a vector quantization with a dictionary determined 
during prior learning. The quantization index iEm is 
provided by the module 40 to the output multiplexer 6 
of the coder. 
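
The segmentation and normalization performed by the modules 38 and 39 can be sketched as follows. This is only an illustrative sketch: the function name is hypothetical, and the order in which the two retained half-blocks are concatenated is an assumption, since the temporal ordering after the circular permutation of figure 3 is not restated here.

```python
def energy_weighting_vector(frame, n_segments=8, seg_len=32):
    """Return (emix, e_max) for one estimated frame of the nonharmonic component.

    Keeps the first and last N/2 samples of the frame, splits them into
    n_segments segments of seg_len samples and normalizes the segment
    energies by the largest one.
    """
    half = n_segments * seg_len // 2              # N/2 = 128 samples at each end
    # Assumption: the two retained blocks are simply concatenated in this order.
    kept = list(frame[:half]) + list(frame[-half:])
    energies = [
        sum(s * s for s in kept[j * seg_len:(j + 1) * seg_len])
        for j in range(n_segments)
    ]
    e_max = max(energies)
    if e_max == 0.0:                              # silent frame: avoid dividing by zero
        return [0.0] * n_segments, 0.0
    return [e / e_max for e in energies], e_max
```

With a frame of 2N = 512 samples, the eight returned components lie between 0 and 1 and at least one of them equals 1 whenever the frame is not silent.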

Figure 7 shows a variant embodiment of the means
employed by the coder of figure 1 to determine the
energy weighting vector Emix for the frame of the
nonharmonic component. The spectral decompression and IFFT
modules 36, 37 operate like those which bear the same
references in figure 1. A selection module 42 is added
so as to determine the value of the modulus of the
spectrum subjected to the inverse Fourier transform 37.
On the basis of the estimated fundamental frequency F0,
the module 42 identifies harmonic regions and
nonharmonic regions of the spectrum of the audio signal.
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For example, a frequency will be regarded as belonging 
to a harmonic region if it is located in a frequency 
interval centered on a harmonic k.F0 and of width
corresponding to a synthesized spectral line width, and
to a nonharmonic region otherwise. In the nonharmonic
regions, the complex signal subjected to the IFFT 37 is
equal to the value of the spectrum, that is to say its
modulus and its phase correspond to the values |X| and
φX provided by the FFT module 3. In the harmonic
regions, this complex signal has the same phase φX as
the spectrum and a modulus given by the lower envelope
after spectral decompression 36. Proceeding thus
according to figure 7 achieves more accurate modeling
of the nonharmonic regions.

The decoder represented in figure 8 comprises an input 
demultiplexer 45 which extracts from the binary stream 
emanating from a coder according to figure 1, the 
quantization indices iF, icxs, icxi, iEm for the 
fundamental frequency F0, the cepstral coefficients
representing the compressed upper envelope, the 
coefficients representing the compressed lower 
envelope, and the weighting vector Emix, and 
distributes them respectively to modules 46, 47, 48 and 
49. These modules 46-49 comprise quantization 
dictionaries similar to those of the modules 5, 18, 34 
and 40 of figure 1, so as to restore the values of the 
quantized parameters. The modules 47 and 48 have 
dictionaries so as to form the quantized prediction 
residuals rcx_q[n], and they deduce therefrom the 
quantized cepstral vectors cx_q[n] with elements 
identical to the elements 23-26 of figure 6. These 
quantized cepstral vectors cx_q[n] provide the cepstral 
coefficients cx_sup_q and cx_inf_q processed by the 
decoder. 

A module 51 calculates the fast Fourier transform of 
the cepstral coefficients cx_sup for each signal frame. 
The frequency scale of the compressed spectrum
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resulting therefrom is modified nonlinearly by a module 
52 applying the nonlinear transformation reciprocal to 
that of the module 12 of figure 1, and which provides 
the estimate LX_sup of the compressed upper envelope. A 
spectral decompression of LX_sup, carried out by a 
module 53, provides the upper envelope X_sup comprising 
the estimated values of the modulus of the spectrum at 
the frequencies which are multiples of the fundamental 
frequency F0. The module 54 synthesizes the spectral
estimate Xv of the harmonic component of the audio
signal, through a sum of spectral lines centered on the
frequencies which are multiples of the fundamental
frequency F0 and whose amplitudes (in modulus) are
those given by the upper envelope X_sup. 

Although the digital input stream does not comprise 
any specific information regarding the phase of the 
spectrum of the signal at the harmonics of the 
fundamental frequency, the decoder of figure 8 is 
capable of extracting information regarding this phase 
from the cepstral coefficients cx_sup_q representing 
the compressed upper envelope. This phase information 
is used to assign a phase φ(k) to each of the spectral
lines determined by the module 54 in the estimate of 
the harmonic component of the signal. 

As a first approximation, the speech signal may be 
regarded as being of minimum phase. Moreover, it is 
known that the minimum phase information may be deduced 
easily from cepstral modeling. This minimum phase 
information is therefore calculated for each harmonic 
frequency. The minimum phase assumption signifies that 
the energy of the synthesized signal is localized at 
the start of each period of the fundamental frequency.

In order to be closer to a real speech signal, slight 
dispersion is introduced by means of a specific post- 
liftering of the cepstra during synthesis of the phase. 
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With this post-liftering, performed by the module 55 of 
figure 8, it is possible to emphasize the formant 
resonances of the envelope and hence to control the 
dispersion of the phases. This post-liftering is, for
example, of the form (8).

To limit the phase breaks, it is preferable to smooth 
the post-liftered cepstral coefficients, this being 
performed by the module 56. The module 57 deduces from 
the post-liftered and smoothed cepstral coefficients 
the minimum phase assigned to each spectral line 
representing a harmonic peak of the spectrum. 

The operations performed by the modules 56, 57 for 
smoothing and extracting the minimum phase are 
illustrated by the flowchart of figure 9. The module 56 
examines the variations in the cepstral coefficients so 
as to apply lesser smoothing in the presence of abrupt 
variations than in the presence of slow variations. To 
do this, it performs the smoothing of the cepstral 
coefficients by means of a forget factor λc chosen as a
function of a comparison between a threshold dth and a
distance d between two successive sets of post-liftered
cepstral coefficients. The threshold dth is itself
adapted as a function of the variations of the cepstral
coefficients.

The first step 60 consists in calculating the distance 
d between the two successive vectors relating to frames 
n-1 and n. These vectors, here denoted cxp[n-1] and
cxp[n], correspond for each frame to the collection of 
NCS post-liftered cepstral coefficients representing 
the compressed upper envelope. The distance used may in 
particular be the Euclidean distance between the two 
vectors or else a quadratic distance. 

Two smoothings are firstly performed, respectively by
means of forget factors λmin and λmax, so as to determine
a minimum distance dmin and a maximum distance dmax. The
threshold dth is then determined in step 70 as being
situated between the minimum and maximum distances dmin,
dmax: dth = β.dmax + (1-β).dmin, the coefficient β being,
for example, equal to 0.5.

In the example represented, the forget factors λmin and
λmax are themselves selected from among two distinct
values, respectively λmin1, λmin2 and λmax1, λmax2 lying
between 0 and 1, the values λmin1, λmax1 each being
substantially nearer to 0 than the values λmin2, λmax2.
If d > dmin (test 61), the forget factor λmin is equal to
λmin1 (step 62); otherwise, it is taken equal to λmin2
(step 63). In step 64, the minimum distance dmin is
taken equal to λmin.dmin + (1-λmin).d. If d > dmax (test
65), the forget factor λmax is equal to λmax1 (step 66);
otherwise, it is taken equal to λmax2 (step 67). In
step 68, the maximum distance dmax is taken equal to
λmax.dmax + (1-λmax).d.

If the distance d between the two consecutive cepstral
vectors is greater than the threshold dth (test 71),
then a value λc1 relatively close to 0 is adopted for
the forget factor (step 72). In this case, the
corresponding signal is regarded as being of
nonstationary type, so that there is no need to keep a
large memory of the earlier cepstral coefficients. If
d ≤ dth, a value λc2 which is not as close to 0 is
adopted in step 73 for the forget factor λc, so as to
further smooth the cepstral coefficients. The smoothing
is performed in step 74, where the vector cxl[n] of
smoothed coefficients for the current frame n is
determined by:

cxl[n] = λc.cxl[n-1] + (1-λc).cxp[n] (11)
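
Steps 60 to 74 of figure 9 can be sketched as a small stateful smoother. This is a sketch under stated assumptions rather than the coder's actual implementation: the class name, the initial state (all zeros) and every numeric forget-factor value are hypothetical, and the Euclidean distance is only one of the distances mentioned above.

```python
import math


class CepstralSmoother:
    """Adaptive smoothing of post-liftered cepstral vectors (steps 60-74)."""

    def __init__(self, ncs, lam_min=(0.1, 0.9), lam_max=(0.1, 0.9),
                 lam_c=(0.1, 0.7), beta=0.5):
        # All numeric defaults are hypothetical; each pair is (value 1, value 2),
        # value 1 being the one nearer to 0.
        self.lam_min, self.lam_max, self.lam_c = lam_min, lam_max, lam_c
        self.beta = beta
        self.d_min = 0.0
        self.d_max = 0.0
        self.prev = [0.0] * ncs      # cxp[n-1]
        self.cxl = [0.0] * ncs       # cxl[n-1]

    def update(self, cxp):
        d = math.dist(cxp, self.prev)                                    # step 60
        lam = self.lam_min[0] if d > self.d_min else self.lam_min[1]    # test 61
        self.d_min = lam * self.d_min + (1.0 - lam) * d                 # step 64
        lam = self.lam_max[0] if d > self.d_max else self.lam_max[1]    # test 65
        self.d_max = lam * self.d_max + (1.0 - lam) * d                 # step 68
        d_th = self.beta * self.d_max + (1.0 - self.beta) * self.d_min  # step 70
        lam_c = self.lam_c[0] if d > d_th else self.lam_c[1]            # test 71
        self.cxl = [lam_c * old + (1.0 - lam_c) * new                   # step 74
                    for old, new in zip(self.cxl, cxp)]
        self.prev = list(cxp)
        return self.cxl
```

Calling `update` once per frame with the post-liftered vector returns the smoothed vector used for the phase calculation; an abrupt change (large d) selects the forget factor nearer to 0, so the previous smoothed values are largely discarded.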

The module 57 then calculates the minimum phases φ(k)
associated with the harmonics k.F0. In a known manner,
the minimum phase for a harmonic of order k is given
by:

$$\varphi(k) = -2 \sum_{m=1}^{NCS-1} cxl[n,m] \, \sin(2\pi m k F_0 / F_e) \qquad (12)$$

where cxl[n,m] designates the smoothed cepstral
coefficient of order m for frame n.

In step 75, the harmonic index k is initialized to 1.
To initialize the calculation of the minimum phase
assigned to harmonic k, the phase φ(k) and the cepstral
index m are initialized to 0 and 1 respectively in
step 76. In step 77, the module 57 adds the quantity
-2.cxl[n,m].sin(2πmk.F0/Fe) to the phase φ(k). The
cepstral index m is incremented in step 78 and compared
with NCS in step 79. Steps 77 and 78 are repeated so
long as m < NCS. When m = NCS, the calculation of the
minimum phase is terminated for harmonic k, and the
index k is incremented in step 80. The calculation of
minimum phases 76-79 is rerun for the next harmonic so
long as k.F0 < Fe/2 (test 81).
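
The loop of steps 75 to 81 amounts to evaluating the sum of relation (12) for every harmonic below half the sampling frequency. A minimal sketch follows; the function name is hypothetical, and the loop test is placed before the first harmonic rather than after the increment, which only differs from the flowchart if the fundamental frequency itself reaches Fe/2.

```python
import math


def minimum_phases(cxl, f0, fe):
    """Minimum phase of each harmonic k*f0 < fe/2 from smoothed cepstral coefficients.

    cxl[m] is the smoothed cepstral coefficient of order m (m = 0 .. NCS-1);
    the coefficient of order 0 does not contribute to the phase.
    """
    ncs = len(cxl)
    phases = []
    k = 1                                          # step 75
    while k * f0 < fe / 2.0:                       # test 81
        phi = 0.0                                  # step 76
        for m in range(1, ncs):                    # steps 77-79
            phi += -2.0 * cxl[m] * math.sin(2.0 * math.pi * m * k * f0 / fe)
        phases.append(phi)
        k += 1                                     # step 80
    return phases
```

The returned list holds the phase of harmonic 1 at index 0, harmonic 2 at index 1, and so on.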

In the exemplary embodiment according to figure 8, the 
module 54 takes account of a constant phase over the 
width of each spectral line, equal to the minimum phase 
φ(k) provided for the corresponding harmonic k by the
module 57.

The estimate Xv of the harmonic component is 
synthesized by summation of spectral lines positioned 
at the harmonic frequencies of the fundamental 
frequency F0. During this synthesis, it is possible to
position the spectral lines on the frequency axis with 
a higher resolution than the resolution of the Fourier 
transform. To do this, a reference spectral line is 
precalculated once and for all according to the higher 
resolution. This calculation can consist of a Fourier 
transform of the analysis window fA with a transform
size of 16 384 points, achieving a resolution of about 0.5 Hz
per point. The synthesis of each harmonic line is then
performed by the module 54 by positioning on the
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frequency axis the reference line with high resolution, 
and by undersampling this reference spectral line so as
to reduce to the resolution of 15.625 Hz of the Fourier
transform on 512 points. This enables the spectral line
to be positioned accurately.
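
The two resolutions and the undersampling of the reference line can be checked numerically. The sketch below assumes a sampling frequency of 8 kHz (consistent with frames of 256 samples lasting 32 ms) and a reference line stored with its centre at index 0; the function names and this storage convention are hypothetical.

```python
def bin_resolution(fe, size):
    """Frequency spacing, in Hz, between two points of a transform of the given size."""
    return fe / size


def reference_index(i, harmonic_hz, fe=8000.0, coarse=512, fine=16384):
    """Index of the high-resolution reference line read by coarse bin i.

    The reference line is assumed to be centred on index 0, so the result is
    the signed offset, in high-resolution points, between bin i of the
    512-point transform and the harmonic located at harmonic_hz.
    """
    offset_hz = i * fe / coarse - harmonic_hz
    return round(offset_hz * fine / fe)
```

For example, a harmonic at 150 Hz falls between bins 9 and 10 of the 512-point transform; bin 10 (156.25 Hz) then reads the reference line 13 high-resolution points away from its centre.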

For the determination of the lower envelope, the FFT
module 85 of the decoder of figure 8 receives the NCI
quantized cepstral coefficients cx_inf_q of orders 0 to
NCI - 1, and it advantageously supplements them with
the NCS - NCI cepstral coefficients cx_sup_q of order
NCI to NCS - 1 representing the upper envelope.
Specifically, it may be estimated that, as a first
approximation, the fast variations of the compressed
lower envelope are well reproduced by those of the
compressed upper envelope. In another embodiment, the
FFT module 85 could consider only the NCI cepstral
parameters cx_inf_q.

The module 86 converts the frequency scale in a manner
reciprocal to the conversion carried out by the module
32 of the coder, so as to restore the estimate LX_inf
of the compressed lower envelope, subjected to the
spectral decompression module 87. At the output of the
module 87, the decoder is furnished with a lower
envelope X_inf comprising the values of the modulus of
the spectrum in the valleys situated between the
harmonic peaks.

This envelope X_inf will modulate the spectrum of a
noise frame whose phase is processed as a function of
the quantized weighting vector Emix extracted by the
module 49. A generator 88 delivers a normalized noise
frame whose 4-ms segments are weighted in a module 89
in accordance with the normalized components of the
vector Emix provided by the module 49 for the current
frame. This noise is white noise high-pass filtered so
as to take account of the low level which in principle
the unvoiced component has at the low frequencies. On
the basis of the energy-weighted noise, the module 90
forms frames of 2N = 512 samples by applying the
analysis window fA, the insertion of 256 zero samples
and the circular permutation for phase compensation in
accordance with what was explained with reference to
figure 3. The Fourier transform of the resulting frame
is calculated by the FFT module 91.

The spectral estimate Xuv of the nonharmonic component 
is determined by the spectral synthesis module 92 which 
performs a frequency-by-frequency weighting. This 
weighting consists in multiplying each complex spectral 
value provided by the FFT module 91 by the value of the 
lower envelope X_inf obtained for the same frequency by 
the spectral decompression module 87.

The spectral estimates Xv, Xuv of the harmonic (voiced 
in the case of a speech signal) and nonharmonic (or 
unvoiced) components are combined by a mixing module 95 
controlled by a module 96 for analyzing the degree of 
harmonicity (or of voicing) of the signal. 

The organization of these modules 95, 96 is illustrated 
by figure 10. The analysis module 96 comprises a unit 
97 for estimating a frequency-dependent degree of 
voicing W from which are calculated four frequency- 
dependent gains, namely two gains gv, guv controlling
the relative magnitude of the harmonic and nonharmonic
components in the synthesized signal, and two gains
gv_φ, guv_φ used to add noise to the phase of the
harmonic component.

The degree of voicing W(i) is a continuously varying
value lying between 0 and 1 determined for each
frequency index i (0 ≤ i < N) as a function of the
upper envelope X_sup(i) and of the lower envelope
X_inf(i) which are obtained for this frequency i by the
decompression modules 53, 87. The degree of voicing
W(i) is estimated by the unit 97 for each frequency
index i corresponding to a harmonic of the fundamental
frequency F0, namely

$$i = \left\lfloor \frac{2 N k F_0}{F_e} + \frac{1}{2} \right\rfloor \quad \text{for } k = 1, 2, \ldots$$

by an increasing function of the ratio of the upper
envelope X_sup to the lower envelope X_inf at this
frequency, for example according to the formula:

$$W(i) = \min\left\{ 1, \; \frac{10 \log_{10}\left[ X_{sup}(i) / X_{inf}(i) \right]}{V_{th}(F_0)} \right\} \qquad (13)$$



The threshold Vth(F0) corresponds to the average
dynamic swing calculated over a purely voiced synthetic
spectrum at the fundamental frequency. It is
advantageously chosen to be dependent on the
fundamental frequency F0.

The degree of voicing W(i) for a frequency other than
the harmonic frequencies is obtained simply as being
equal to that estimated for the closest harmonic.
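
One possible reading of the voicing estimate is sketched below. It assumes that the formula takes the form of a log ratio of the two envelopes, expressed in dB, divided by the threshold and clipped to 1; the lower clip at 0, the function names and the 8 kHz sampling frequency are assumptions made for the sketch.

```python
import math


def harmonic_bin(k, f0, fe=8000.0, n=256):
    """Frequency index (on a 2N-point transform) nearest to harmonic k of f0."""
    return math.floor(2 * n * k * f0 / fe + 0.5)


def degree_of_voicing(x_sup, x_inf, v_th):
    """Clipped ratio between the envelope dynamic (in dB) and the threshold v_th.

    x_sup and x_inf are the upper and lower envelope values at one harmonic
    bin; v_th stands for the threshold Vth(F0), also in dB.
    """
    ratio_db = 10.0 * math.log10(x_sup / x_inf)
    # The lower clip at 0 is an added safeguard, not stated in the text.
    return max(0.0, min(1.0, ratio_db / v_th))
```

A large gap between the two envelopes (a pronounced harmonic peak above the valley floor) gives a value close to 1, whereas envelopes of similar level give a value close to 0.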

The gain gv(i), which depends on the frequency, is
obtained by applying a nonlinear function to the degree
of voicing W(i) (block 98). This nonlinear function
has, for example, the form represented in figure 11:

$$g_v(i) = \begin{cases} 0 & \text{if } 0 \le W(i) < W_1 \\ \dfrac{W(i) - W_1}{W_2 - W_1} & \text{if } W_1 \le W(i) < W_2 \\ 1 & \text{if } W_2 \le W(i) \le 1 \end{cases} \qquad (14)$$

the thresholds W1, W2 being such that 0 < W1 < W2 < 1.
The gain guv can be calculated in a similar manner to
the gain gv (the sum of the two gains gv, guv being
constant, for example equal to 1), or deduced simply
from the latter through the relation guv(i) = 1 - gv(i),
as shown diagrammatically by the subtracter 99 in
figure 10.
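
The mapping from the degree of voicing to the two mixing gains can be sketched as a clipped linear ramp. The function names are hypothetical, and the threshold values used in the usage note are illustrative only.

```python
def gain_v(w, w1, w2):
    """Harmonic gain: 0 below w1, 1 from w2 upward, linear in between."""
    if w < w1:
        return 0.0
    if w < w2:
        return (w - w1) / (w2 - w1)
    return 1.0


def gain_uv(w, w1, w2):
    """Nonharmonic gain deduced from the harmonic gain (subtracter 99)."""
    return 1.0 - gain_v(w, w1, w2)
```

With W1 = 0.25 and W2 = 0.75, a degree of voicing of 0.5 gives equal gains of 0.5 for the harmonic and nonharmonic components.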

It is beneficial to be able to add noise to the phase 
of the harmonic component of the signal at a given 
frequency if the analysis of the degree of voicing 
shows that the signal is actually of nonharmonic type
at this frequency. To do this, the phase φ'v of the
mixed harmonic component is the result of a linear
combination of the phases φv, φuv of the harmonic and
nonharmonic components Xv, Xuv synthesized by the
modules 54, 92.

The gains gv_φ, guv_φ respectively applied to these
phases are calculated from the degree of voicing W and
also weighted as a function of the frequency index i,
given that the adding of noise to the phase is actually
useful only beyond a certain frequency.

A first gain gv1_φ is calculated by applying a nonlinear
function to the degree of voicing W(i), as shown
diagrammatically by the block 100 in figure 10. This
nonlinear function can have the form represented in
figure 12:

$$g_{v1\_\varphi}(i) = \begin{cases} G_1 & \text{if } 0 \le W(i) < W_3 \\ G_1 + (1 - G_1) \dfrac{W(i) - W_3}{W_4 - W_3} & \text{if } W_3 \le W(i) < W_4 \\ 1 & \text{if } W_4 \le W(i) \le 1 \end{cases} \qquad (15)$$

the thresholds W3 and W4 being such that 0 < W3 < W4
< 1, and the minimum gain G1 lying between 0 and 1.

A multiplier 101 multiplies for each frequency of index
i the gain gv1_φ by another gain gv2_φ dependent only on
the frequency index i, so as to form the gain gv_φ(i).
The gain gv2_φ(i) depends nonlinearly on the frequency
index i, for example as indicated in figure 13:

$$g_{v2\_\varphi}(i) = \begin{cases} 1 & \text{if } 0 \le i < i_1 \\ 1 - (1 - G_2) \dfrac{i - i_1}{i_2 - i_1} & \text{if } i_1 \le i < i_2 \\ G_2 & \text{if } i_2 \le i < N \end{cases} \qquad (16)$$

the indices i1 and i2 being such that 0 < i1 < i2 < N,
and the minimum gain G2 lying between 0 and 1. The gain
guv_φ(i) can be calculated simply as being equal to
1 - gv_φ(i) = 1 - gv1_φ(i).gv2_φ(i) (subtracter 102 of
figure 10).
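
The two phase gains can be sketched with a single clipped ramp helper: one ramp rising with the degree of voicing, one falling with the frequency index, and their product. The helper and function names are hypothetical, and the numeric thresholds used in the usage note are illustrative only.

```python
def ramp(x, x_lo, x_hi, y_lo, y_hi):
    """Piecewise-linear ramp: y_lo below x_lo, y_hi from x_hi on, linear between."""
    if x < x_lo:
        return y_lo
    if x >= x_hi:
        return y_hi
    return y_lo + (y_hi - y_lo) * (x - x_lo) / (x_hi - x_lo)


def phase_gains(w, i, w3, w4, g1, i1, i2, g2):
    """Return (g_v_phi, g_uv_phi) for degree of voicing w at frequency index i."""
    g_v1 = ramp(w, w3, w4, g1, 1.0)      # rises from G1 to 1 with the voicing
    g_v2 = ramp(i, i1, i2, 1.0, g2)      # falls from 1 to G2 with the frequency
    g_v = g_v1 * g_v2                    # multiplier 101
    return g_v, 1.0 - g_v                # subtracter 102
```

A fully voiced bin at low frequency keeps the harmonic phase untouched (gains 1 and 0), while a weakly voiced bin at high frequency gives most of the weight to the noise phase.
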
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The complex spectrum Y of the synthesized signal is 
produced by the mixing module 95, which carries out the 
following mixing relation, for 0 ≤ i < N:

Y(i) = gv(i).|Xv(i)|.exp[jφ'v(i)] + guv(i).Xuv(i) (17)
with φ'v(i) = gv_φ(i).φv(i) + guv_φ(i).φuv(i) (18)

where φv(i) designates the argument of the complex
number Xv(i) provided by the module 54 for the
frequency of index i (block 104 of figure 10), and
φuv(i) designates the argument of the complex number
Xuv(i) provided by the module 92 (block 105 of
figure 10). This combination is carried out by the
multipliers 106-110 and the adders 111-112 represented
in figure 10.
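
For a single frequency index, the mixing relation can be sketched with complex arithmetic as follows. The function and argument names are hypothetical; the four gains are assumed to have been computed beforehand for the index considered.

```python
import cmath


def mix_bin(xv, xuv, gv, guv, gv_phi, guv_phi):
    """Mixed spectral value for one frequency index.

    xv, xuv : complex values of the harmonic and nonharmonic estimates
    gv, guv : magnitude gains; gv_phi, guv_phi : phase gains
    """
    phi_v = cmath.phase(xv)                              # block 104
    phi_uv = cmath.phase(xuv)                            # block 105
    phi_mixed = gv_phi * phi_v + guv_phi * phi_uv        # mixed phase
    return gv * abs(xv) * cmath.exp(1j * phi_mixed) + guv * xuv
```

When the bin is fully voiced (gains 1, 0, 1, 0) the result is simply the harmonic value; when it is fully unvoiced (gains 0, 1, 0, 1) it is the nonharmonic value.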

The mixed spectrum Y(i) for 0 ≤ i < 2N (with Y(2N-1-i)
= Y(i)) is then transformed into the time domain by the
IFFT module 115 (figure 8). Only the first N/2 = 128
and the last N/2 = 128 samples of the frame of 2N = 512
samples produced by the module 115 are retained, and
the circular permutation inverse to that illustrated by
figure 3 is applied to obtain the synthesized frame of
N = 256 samples weighted by the analysis window fA.

The frames obtained successively in this manner are
finally processed by the temporal synthesis module 116
which forms the decoded audio signal x.

The temporal synthesis module 116 performs an overlap
sum of frames modified with respect to those evaluated
successively at the output of the module 115. The
modification may be viewed in two steps illustrated by
figures 14 and 15 respectively.

The first step (figure 14) consists in multiplying each
frame 2' delivered by the IFFT module 115 by a window
1/fA inverse to the analysis window fA employed by the
module 1 of the coder. The samples of the frame 2"
resulting therefrom are therefore uniformly weighted.
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The second step (figure 15) consists in multiplying the 
samples of this frame 2" by a synthesis window fs
satisfying the following properties:

fs(N-L+i) + fs(i) = A for 0 ≤ i < L (19)

fs(i) = A for L ≤ i < N-L (20)

where A designates an arbitrary positive constant, for
example A = 1. The synthesis window fs(i) increases
progressively from 0 to A for i going from 0 to L. It
is, for example, a raised half-sinusoid:

fs(i) = (A/2).(1 - cos[(i + 1/2)π/L]) for 0 ≤ i < L (21)

After having reweighted each frame 2" by the synthesis
window fs, the module 116 positions the successive
frames with their time shifts of M = 160 samples and
their time overlaps of L = 96 samples, then it sums the
frames thus positioned over time. Owing to the
properties (19) and (20) of the synthesis window fs,
each sample of the decoded audio signal x thus obtained
is assigned a uniform global weight, equal to A. This
global weight originates from the contribution of a
single frame if the sample has in this frame a rank i
such that L ≤ i < N - L, and comprises the summed
contributions of two successive frames if 0 ≤ i < L
or N - L ≤ i < N.

It is thus possible to perform the temporal synthesis
in a simple manner even if, as in the case considered,
the overlap L between two successive frames is smaller
than half the size N of these frames.
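
The uniform global weight can be checked numerically by building a window that satisfies the two properties above and summing shifted copies of it. The sketch below uses N = 256, L = 96, M = 160 and A = 1 from the text; the function names are hypothetical, and the falling edge is derived directly from the complementarity property rather than from a separate formula.

```python
import math


def synthesis_window(n=256, l=96, a=1.0):
    """Raised half-sinusoid rise, flat top at a, complementary fall."""
    fs = []
    for i in range(n):
        if i < l:                                   # rising edge
            fs.append(a / 2.0 * (1.0 - math.cos((i + 0.5) * math.pi / l)))
        elif i < n - l:                             # flat part
            fs.append(a)
        else:                                       # falling edge: a minus the rise
            fs.append(a - fs[i - (n - l)])
    return fs


def overlap_add(frames, shift):
    """Sum equally sized frames positioned every `shift` samples."""
    out = [0.0] * (shift * (len(frames) - 1) + len(frames[0]))
    for j, frame in enumerate(frames):
        for i, s in enumerate(frame):
            out[j * shift + i] += s
    return out
```

Summing four copies of the window shifted by M = 160 samples gives a constant weight A everywhere except in the first and last L samples, which belong to the rising edge of the first frame and the falling edge of the last one.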

The two steps set forth above for modifying the signal
frames may be merged into a single step. It is
sufficient to precalculate a compound window
fc(i) = fs(i)/fA(i) and simply to multiply the frames 2'
of N = 256 samples delivered by the module 115 by the
compound window fc before performing the overlap
summation.
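
The precalculation of the compound window reduces to a sample-by-sample division. The sketch below assumes the usual Hamming definition with coefficients 0.54 and 0.46, which never reaches zero, so the division is always defined; the exact analysis window of the coder may differ, and the function names are hypothetical.

```python
import math


def hamming(n):
    """Standard Hamming window (assumed form of the analysis window)."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def compound_window(fs, fa):
    """Compound window fc(i) = fs(i) / fA(i), computed once and reused per frame."""
    return [s / a for s, a in zip(fs, fa)]
```

Multiplying a frame already weighted by the analysis window by this compound window yields the same result as dividing by the analysis window and then multiplying by the synthesis window.
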
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Figure 16 shows the shape of the compound window fc in 
the case where the analysis window fA is a Hamming 
window and the synthesis window fs has the form given 
by relations (19) to (21).

Other forms of the synthesis window fs satisfying
relations (19) and (20) may be employed. In the variant
of figure 17, it is a piecewise affine function defined
by:

fs(i) = A.i/L for 0 ≤ i < L (22)



In order to improve the quality of coding of the audio
signal, the coder of figure 1 can increase the rate of
formation and of analysis of the frames, so as to
transmit more quantization parameters to the decoder.
In the frame structure represented in figure 2, a frame
of N = 256 samples (32 ms) is formed every 20 ms. These
frames of 256 samples could be formed at a higher rate,
for example every 10 ms, two successive frames then having a
shift of M/2 = 80 samples and an overlap of 176
samples.

Under these conditions, it is possible to transmit the
complete sets of quantization parameters iF, icxs,
icxi, iEm for just one subcollection of frames, and to
transmit, for the other frames, parameters making it
possible to perform a suitable interpolation at the
level of the decoder. In the example envisaged
hereinabove, the subcollection for which complete
parameter sets are transmitted may consist of the
frames of integer rank n, whose periodicity is
M/Fe = 20 ms, and the frames for which an interpolation
is performed may be those of half-integer rank n + 1/2
which are shifted by 10 ms with respect to the frames
of the subcollection.

In the embodiment illustrated by figure 18, the 
notation cx_q[n-1] and cx_q[n] designates quantized
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cepstral vectors determined, for two successive frames 
of integer rank, by the quantization module 18 and/or 
by the quantization module 34. These vectors comprise, 
for example, four consecutive cepstral coefficients 
each. They could also comprise more cepstral 
coefficients.

A module 120 performs an interpolation of these two 
cepstral vectors cx_q[n-1] and cx_q[n] so as to
estimate an intermediate value cx_i[n-1/2]. The
interpolation performed by the module 120 can be a
simple arithmetic average of the vectors cx_q[n-1] and
cx_q[n]. As a variant, the module 120 could apply a
more sophisticated interpolation formula, for example 
polynomial, based also on the cepstral vectors obtained 
for frames earlier than frame n-1. Moreover, if more 
than one interpolated frame is interposed between two 
consecutive frames of integer rank, the interpolation 
takes account of the relative position of each 
interpolated frame. 

With the aid of the means described above, the coder 
also calculates the cepstral coefficients cx[n-1/2]
relating to the frame of half-integer rank. In the case
of the upper envelope, these cepstral coefficients are
those provided by the IFFT module 13 after post-
liftering 15 (for example with the same post-liftering
coefficients as for the previous frame n-1) and
normalization 16. In the case of the lower envelope,
the cepstral coefficients cx[n-1/2] are those delivered
by the IFFT module 33.

A subtracter 121 forms the difference ecx[n-1/2]
between the cepstral coefficients cx[n-1/2] calculated
for the frame of half-integer rank and the coefficients
cx_i[n-1/2] estimated by interpolation. This difference
is provided to a quantization module 122 which
addresses quantization indices icx[n-1/2] to the output
multiplexer 6 of the coder. The module 122 operates,
for example, by vector quantization of the
interpolation errors ecx[n-1/2] determined successively
for the frames of half-integer rank.

This quantization of the interpolation error can be
performed by the coder for each of the NCS + NCI 
cepstral coefficients used by the decoder, or for just 
some of them, typically those of smallest orders. 

The corresponding means of the decoder are illustrated
by figure 19. The decoder operates essentially like
that described with reference to figure 8 to determine
the signal frames of integer rank. An interpolation
module 124 identical to the module 120 of the coder
estimates the intermediate coefficients cx_i[n-1/2]
from the quantized coefficients cx_q[n-1] and cx_q[n]
provided by the module 47 and/or the module 48 from the
indices icxs, icxi extracted from the stream Φ. A
module 125 for extracting parameters receives the
quantization index icx[n-1/2] from the input
demultiplexer 45 of the decoder, and deduces therefrom
the quantized interpolation error ecx_q[n-1/2] from the
same quantization dictionary as that used by the module
122 of the coder. An adder 126 sums the cepstral
vectors cx_i[n-1/2] and ecx_q[n-1/2] so as to provide
the cepstral coefficients cx[n-1/2] which will be used
by the decoder (modules 51-57, 95, 96, 115 and/or
modules 85-87, 92, 95, 96, 115) so as to form the
interpolated frame of rank n-1/2.
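
The coder and decoder sides of this interpolation scheme can be sketched together. The arithmetic-average interpolation follows the text; the uniform scalar quantizer with a step of 0.25 is a hypothetical stand-in for the vector quantization dictionary shared by the modules 122 and 125, and all function names are illustrative.

```python
def interpolate(cx_q_prev, cx_q_curr):
    """Arithmetic average of the two quantized vectors (modules 120 and 124)."""
    return [(a + b) / 2.0 for a, b in zip(cx_q_prev, cx_q_curr)]


def encode_half_frame(cx_half, cx_q_prev, cx_q_curr, step=0.25):
    """Coder side: quantization indices of the interpolation error."""
    cx_i = interpolate(cx_q_prev, cx_q_curr)
    ecx = [c - i for c, i in zip(cx_half, cx_i)]          # subtracter 121
    # Hypothetical uniform quantizer standing in for the module 122.
    return [round(e / step) for e in ecx]


def decode_half_frame(indices, cx_q_prev, cx_q_curr, step=0.25):
    """Decoder side: interpolated vector corrected by the quantized error."""
    cx_i = interpolate(cx_q_prev, cx_q_curr)
    ecx_q = [q * step for q in indices]                   # module 125
    return [i + e for i, e in zip(cx_i, ecx_q)]           # adder 126
```

Because both sides run the same interpolation on the same quantized vectors, only the error indices need to be transmitted for the frame of half-integer rank.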

If just some of the cepstral coefficients have formed 
the subject of an interpolation error quantization, the 
others are determined by the decoder by a simple 
interpolation with no correction. 

The decoder can also interpolate the other parameters 
F0, Emix used to synthesize the signal frames. The
fundamental frequency F0 can be linearly interpolated,
either in the time domain, or (preferably) directly in 
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the frequency domain. For the possible interpolation of 
the energy weighting vector Emix, it is appropriate to 
perform the interpolation after denormalization and 
while of course taking account of the time shifts
between frames. 

It should be noted that it is especially advantageous,
in order to interpolate the representation of the
spectral envelopes, to perform this interpolation in
the cepstral domain. Unlike an interpolation performed
on other parameters, such as the LSP coefficients
(standing for "Line Spectrum Pairs"), the linear
interpolation of the cepstral coefficients corresponds
to the linear interpolation of the compressed spectral
amplitudes.

In the variant represented in figure 20, the coder uses
the cepstral vectors cx_q[n], cx_q[n-1], ..., cx_q[n-r]
and cx_q[n-1/2] calculated for the last frames which
have passed (r ≥ 1) so as to identify an optimal
interpolator filter which, when fed with the quantized
cepstral vectors cx_q[n-r], ..., cx_q[n] relating to
the frames of integer rank, delivers an interpolated
cepstral vector cx_i[n-1/2] which exhibits a minimum
distance with the vector cx[n-1/2] calculated for the
last frame of half-integer rank.

In the example represented in figure 20, this
interpolator filter 128 is present in the coder, and a
subtracter 129 deducts its output cx_i[n-1/2] from the
calculated cepstral vector cx[n-1/2]. A minimization
module 130 determines the parameter set {P} of the
interpolator filter 128, for which the interpolation
error ecx[n-1/2] delivered by the subtracter 129
exhibits a minimum norm. This parameter set {P} is
addressed to a quantization module 131 which provides a
corresponding quantization index iP to the output
multiplexer 6 of the coder.

As a function of the bit rate allocated in the stream Φ
to the indices for quantizing the parameters {P}
defining the optimal interpolator filter 128, it will
be possible to adopt a finer or coarser quantization of
these parameters, or a more or less elaborate form of
the interpolator filter, or else to envisage several
interpolator filters quantized differently for various
vectors of cepstral coefficients.

In a simple embodiment, the interpolator filter 128 is
linear, with r = 1:

cx_i[n-1/2] = ρ.cx_q[n-1] + (1-ρ).cx_q[n] (23)

and the parameter set {P} is limited to the coefficient
ρ lying between 0 and 1.
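
For this linear case, the minimization has a closed form if the norm of the interpolation error is taken to be Euclidean. The sketch below makes that assumption; the function name, the clipping to the interval from 0 to 1 and the fallback value used when the two quantized vectors coincide are all choices made for the illustration.

```python
def optimal_coefficient(cx_half, cx_q_prev, cx_q_curr):
    """Least-squares weight of cx_q_prev in the linear interpolator of relation (23).

    Minimizes the squared Euclidean norm of
    cx_half - (w * cx_q_prev + (1 - w) * cx_q_curr) over w, then clips w to [0, 1].
    """
    diff = [a - b for a, b in zip(cx_q_prev, cx_q_curr)]
    denom = sum(x * x for x in diff)
    if denom == 0.0:
        return 0.5                     # both vectors equal: any weight gives the same result
    num = sum((h - c) * x for h, c, x in zip(cx_half, cx_q_curr, diff))
    return min(1.0, max(0.0, num / denom))
```

If the half-rank vector lies three quarters of the way from the current vector to the previous one, the function returns 0.75, and the interpolation error is then zero.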

From the indices iP for quantizing the parameters {P}
obtained in the binary stream Φ, the decoder
reconstructs the interpolator filter 128 (to within
quantization errors) and processes the cepstral vectors
cx_q[n-r], ..., cx_q[n] so as to estimate the cepstral
coefficients cx[n-1/2] used to synthesize the frames of
half-integer rank.

Generally, the decoder can use a simple interpolation
method (without transmission of parameters by the coder
for the frames of half-integer rank), or an
interpolation method with incorporation of a quantized
interpolation error (according to figures 18 and 19),
or an interpolation method with an optimal interpolator
filter (according to figure 20) to evaluate the frames
of half-integer rank in addition to the frames of
integer rank evaluated directly, as explained with
reference to figures 8 to 13. The temporal synthesis
module 116 can then combine the collection of these
frames evaluated so as to form the synthesized signal x
in the manner explained hereinbelow with reference to
figures 14, 21 and 22.






As in the method of temporal synthesis described above,
the module 116 performs an overlap sum of frames
modified with respect to those evaluated successively
at the output of the module 115, and this modification
can be viewed as two steps, of which the first is
identical to that described above with reference to
figure 14 (dividing the samples of the frame 2' by the
analysis window fA).

The second step (figure 21) consists in multiplying the
samples of the renormalized frame 2" by a synthesis
window fS satisfying the following properties:

fS(i) = 0 for 0 ≤ i < N/2 - M/p and N/2 + M/p ≤ i < N     (24)

fS(i) + fS(i + M/p) = A for N/2 - M/p ≤ i < N/2     (25)

where A designates an arbitrary positive constant, for
example A = 1, and p is the integer such that the time
shift between the successive frames (calculated
directly and interpolated) is M/p samples, i.e. p = 2
in the example described. The synthesis window fS(i)
increases progressively for i going from N/2 - M/p to
N/2. It is, for example, a raised sinusoid on the
interval N/2 - M/p ≤ i < N/2 + M/p. In particular, the
synthesis window fS can, over this interval, be a
Hamming window (as represented in figure 21) or a
Hanning window.

Figure 21 shows the successive frames 2" repositioned
over time by the module 116. The hatching indicates the
removed portions of the frames (synthesis window at 0).
It may be seen that by performing the overlap sum of
the samples of the successive frames, the property (25)
ensures homogeneous weighting of the samples of the
synthesized signal.
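
By way of non-limiting illustration, the following Python sketch builds a raised-sinusoid synthesis window that is zero outside the central interval of width 2M/p and checks numerically that the overlap sum of windows shifted by M/p samples gives a constant weighting. The function names, the parameter K (standing for M/p) and the Hanning-type shape are assumptions made for the example.

```python
import math


def synthesis_window(N, K, A=1.0):
    """Raised-cosine window, nonzero only for N/2-K <= i < N/2+K (K = M/p)."""
    start = N // 2 - K
    w = [0.0] * N
    for j in range(2 * K):
        # Hanning-type shape over 2K samples; w(j) + w(j+K) = A for 0 <= j < K
        w[start + j] = A * 0.5 * (1.0 - math.cos(math.pi * j / K))
    return w


def overlap_weights(w, K, n_frames):
    """Sum of n_frames copies of the window w, each shifted by K samples."""
    total = [0.0] * (len(w) + K * (n_frames - 1))
    for f in range(n_frames):
        for i, v in enumerate(w):
            total[f * K + i] += v
    return total
```

For N = 16 and K = 4, for example, every sample covered by two successive frames receives a total weight equal to A, which is the homogeneous weighting referred to above.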

As in the method of synthesis illustrated by figures 14
and 15, the procedure for weighting the frames obtained
by inverse Fourier transform of the spectra Y can be
performed in a single step, with a compound window
fC(i) = fS(i)/fA(i). Figure 22 shows the form of the
compound window fC in the case where the windows fA and
fS are of Hamming type.
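
By way of non-limiting illustration, a compound window obtained as the ratio of a synthesis window to an analysis window, both of Hamming type, can be sketched in Python as follows. The function names, the parameter K (standing for M/p) and the exact Hamming coefficients 0.54 and 0.46 are assumptions made for the example.

```python
import math


def hamming_analysis(N):
    """Symmetric Hamming analysis window (never zero, so division is safe)."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (N - 1)) for i in range(N)]


def hamming_synthesis(N, K):
    """Hamming-shaped synthesis window on N/2-K <= i < N/2+K, zero elsewhere."""
    w = [0.0] * N
    for j in range(2 * K):
        w[N // 2 - K + j] = 0.54 - 0.46 * math.cos(math.pi * j / K)
    return w


def compound_window(N, K):
    """Single-step weighting: synthesis window divided by analysis window."""
    f_a = hamming_analysis(N)
    f_s = hamming_synthesis(N, K)
    return [s / a for s, a in zip(f_s, f_a)]
```

With this choice of synthesis window, the sum of two windows shifted by K samples is the constant A = 1.08 over the overlap region.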

Like the method of temporal synthesis illustrated by
figures 14 to 11, that illustrated by figures 14, 21
and 22 makes it possible to take into account an
overlap L between two analysis frames (for which the
analysis is performed completely) which is smaller than
half the size N of these frames. In general, this
latter method is applicable when the successive
analysis frames exhibit mutual time shifts M of more
than N/2 samples (possibly even of more than N samples
if a very low bit rate is required), the interpolation
leading to a collection of frames whose mutual time
shifts are less than N/2 samples.

The interpolated frames can form the subject of a
reduced transmission of coding parameters, as is
described above, but this is not compulsory. This
embodiment makes it possible to retain a relatively
large interval M between two analysis frames, and hence
to limit the transmission bit rate required, whilst
limiting the discontinuities which are liable to appear
by virtue of the size of this interval with respect to
the typical timescales for the variations in the
parameters of the audio signal, in particular the
cepstral coefficients and the fundamental frequency.

Figures 23 to 25 show other embodiments of the means
employed to process the cepstral coefficients cx_sup
delivered by the IFFT module 13 of figure 1,
representing the upper envelope.

In the three cases, the post-liftering module 15,
normalizing module 16, quantization module 18 and
module for calculating the spectral amplitudes 28 are
essentially identical to those described previously
with reference to figure 1. Furthermore, modules for
post-liftering 140, for smoothing 141 and for
extracting the minimum phase 142 are provided so as to
process the post-liftered and quantized cepstral
coefficients cx_sup_q delivered by the quantization
module 18. These modules 140-142 operate essentially
like the corresponding modules 55-57 of the decoder of
figure 8.

In the embodiment shown in figure 23, the adaptation
module 144 accomplishes a function similar to that of
the module 29 of figure 1. However, the adaptation is
not carried out solely on the basis of the modulus of
the spectrum. The module 144 determines the best set of
coefficients for the post-lifter 15 by minimizing the
discrepancy between the spectrum of the audio signal,
in terms of modulus |X| and phase φx, and the
recalculated complex values for one or more of the
harmonics of the fundamental frequency. The moduli of
these latter complex values are given by the
calculation module 28, and their phases correspond to
the minimum phases φ(k) provided by the extraction
module 142. To carry out the adaptation, the module 144
can take into account any appropriate distance in the
complex plane, for example the Euclidean distance.

Thus, the adaptation of the post-lifter 15 by the
module 144 takes account in a combined manner of
frequency aspects of the signal, which are reflected by
the modulus of the spectrum, and of temporal aspects,
which are reflected by the phase of the spectrum.



As represented in dashed lines in figure 23, the post-lifter 140
can also be adaptive, the adaptation performed by the
module 144 pertaining jointly to the two post-lifters
15, 140. In this case, the post-lifter 55 of the
decoder (figure 8) is adapted, like the post-lifter
140, as a function of parameters iLif which the
adaptation module 144 provides to the multiplexer 6 so
that it includes them in the digital stream Φ.
Typically, a few sets of coefficients γ1, γ2 are
envisaged for the post-lifters 140 and 55, and the
module 144 carries out an exhaustive test of these
various sets of coefficients so as to retain the one
which minimizes the discrepancy in the complex plane.
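
By way of non-limiting illustration, the exhaustive test of a few sets of coefficients can be sketched in Python as follows. The function name select_lifter_set and the callable reconstruct, which stands in for the whole chain of post-liftering, quantization and recalculation of moduli and minimum phases, are hypothetical and introduced only for the example; the summed squared Euclidean distance is one possible choice of distance in the complex plane.

```python
def select_lifter_set(candidates, target, reconstruct):
    """Return (index, error) of the candidate set minimizing the summed
    squared Euclidean distance in the complex plane.

    candidates  : list of coefficient sets, e.g. (gamma1, gamma2) pairs
    target      : complex spectrum values of the signal at the tested harmonics
    reconstruct : hypothetical callable mapping a coefficient set to the
                  recalculated complex values at the same harmonics
    """
    best_index, best_err = None, float("inf")
    for index, coeffs in enumerate(candidates):
        recalculated = reconstruct(coeffs)
        err = sum(abs(x - y) ** 2 for x, y in zip(target, recalculated))
        if err < best_err:
            best_index, best_err = index, err
    return best_index, best_err
```

The index returned would play the role of the parameters iLif included in the digital stream.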

In the example represented in figure 24, the adaptation
module 29 for the post-lifter 15 is identical to that
of figure 1. Figure 24 shows a module 145 for
estimating a masking curve allowing the module 29 to
select, for the minimization of the discrepancy in
terms of modulus, the harmonic frequency or frequencies
which most exceed the masking curve calculated on the
basis of the modulus spectrum |X|, as described above.

The post-lifter 140 of figure 24 is adapted separately
by a module 146 which carries out the minimization of
the discrepancies between the phase φx of the spectrum
of the signal and the minimum phase φ(k) calculated by
the module 142 for one or more of the harmonics. Here
again, the harmonics selected for the calculation of
the minimized phase discrepancy may be chosen as a function
of the masking curve estimated by the module 145. The
module 146 provides the output multiplexer 6 of the
coder with the parameters iLif which represent the
optimal post-lifter 140, so that they are used in the
post-lifter 55 of the decoder.

In the example illustrated by figure 25, the post-lifter
140 serving in the calculation of the minimum
phases is not adaptive. The minimum phases φ(k)
calculated by the module 142 for the harmonics of the
fundamental frequency are compared with the phases φx
of the spectrum of the audio signal, and the phase
discrepancy forms the subject of a quantization by a
module 148. The corresponding quantization indices iΔφ
are provided by the module 148 to the output
multiplexer 6 of the coder.

In a decoder (figure 26) corresponding to a coder
according to figure 25, a module 149 utilizes these
quantization indices iΔφ provided by the demultiplexer
45 to obtain the values of the quantized phase
discrepancies, which are added by an adder 150 to the
minimum phases φ(k) calculated by the module 57 (the
post-lifters 140 and 55 being identical). The phases
provided by the adder 150 are then used by the module
54 which synthesizes the spectral lines of the harmonic
component Xv.

The phase discrepancy quantized by the module 148, and
which is used by the modules 149 and 150 of the decoder
to correct the minimum phases φ(k), can be of two
kinds:

- it can represent, for each frequency of index i
corresponding to a harmonic of order k of the
fundamental frequency F0, the difference between
the phase φx(i) of the spectrum of the signal at
the frequency i and the minimum phase φ(k)
calculated by the module 142 for harmonic k;

- alternatively or cumulatively, this phase
discrepancy can represent the variation of the
phase φx of the spectrum over the width of one or
more spectral peaks corresponding to harmonics of
the signal, this variation relating to the minimum
phase φ(k) assigned to the peaks in question.

In both cases, the peak or peaks for which the phase
discrepancy is quantized may be chosen as a function of
the spectral energy represented by the upper envelope,
which is available to the coder and to the decoder,
thereby enabling the decoder to determine that spectral
line to which the discrepancies should be applied.

In the first case, the phase discrepancies may form the
subject of a scalar quantization, or a vector
quantization if they are grouped together for several
peaks.
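
By way of non-limiting illustration, a scalar quantization of the phase discrepancy of the first kind, and the corresponding correction of the minimum phase at the decoder, can be sketched in Python as follows. The uniform quantizer, the 4-bit resolution and the function names are assumptions made for the example.

```python
import math


def wrap_phase(phi):
    """Wrap an angle to the interval [-pi, pi)."""
    return (phi + math.pi) % (2.0 * math.pi) - math.pi


def quantize_discrepancy(phi_x, phi_min, bits=4):
    """Coder side: index of the uniform cell containing phi_x - phi_min."""
    levels = 1 << bits
    step = 2.0 * math.pi / levels
    delta = wrap_phase(phi_x - phi_min)
    return int((delta + math.pi) / step) % levels


def corrected_phase(index, phi_min, bits=4):
    """Decoder side: add the dequantized discrepancy to the minimum phase."""
    levels = 1 << bits
    step = 2.0 * math.pi / levels
    delta_q = -math.pi + (index + 0.5) * step  # centre of the quantizer cell
    return wrap_phase(phi_min + delta_q)
```

With such a quantizer, the corrected phase differs from the phase of the spectrum by at most half a quantization step.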

In the second case, the variation of the phase φx
around the minimum phase φ(k) over the width of a
harmonic peak (determined by the width of the reference
line used by the module 54) can be represented simply
by the slope of a linear segment selected as being that
which exhibits a minimum quadratic distance with the
curve of the variation in phase of the spectrum over
the width of the line, and possibly by a shift at the
origin.
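
By way of non-limiting illustration, the linear segment of minimum quadratic distance can be obtained by an ordinary least-squares fit, sketched in Python as follows. The function name and the argument layout are assumptions made for the example, and the phase deviations are assumed to have been unwrapped beforehand.

```python
def fit_phase_line(freq_offsets, phase_dev):
    """Least-squares slope and shift at the origin of a phase variation.

    freq_offsets : frequency indices relative to the harmonic peak
                   (at least two distinct values)
    phase_dev    : unwrapped phase of the spectrum minus the minimum phase
                   of the peak, at the same frequency indices
    """
    n = len(freq_offsets)
    mean_f = sum(freq_offsets) / n
    mean_p = sum(phase_dev) / n
    sxx = sum((f - mean_f) ** 2 for f in freq_offsets)
    sxy = sum((f - mean_f) * (p - mean_p)
              for f, p in zip(freq_offsets, phase_dev))
    slope = sxy / sxx
    shift = mean_p - slope * mean_f
    return slope, shift
```

For a phase variation that is exactly linear over the peak, the fit returns its slope and its value at the center of the peak.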

These slopes may form the subject of a scalar
quantization, or a vector quantization if they are
grouped together for several peaks.

The quantization of the phase variations over the
harmonic peaks may pertain to the collection of
harmonic frequencies. Another possibility is to
quantize several slopes each obtained by averaging the
slopes at the harmonics over one or more subbands of
the spectrum. This averaging can be weighted so as to
take account of the energies relating to the various
harmonic frequencies, represented by the upper
envelope.
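
By way of non-limiting illustration, the energy-weighted averaging of the slopes per subband can be sketched in Python as follows. The function name, the representation of the subbands by a list of harmonic indices and the zero slope returned for a subband without energy are assumptions made for the example.

```python
def subband_slopes(slopes, energies, band_edges):
    """Energy-weighted mean slope for each subband.

    slopes     : phase slope at each harmonic
    energies   : energy at each harmonic (e.g. read from the upper envelope)
    band_edges : harmonic indices delimiting the subbands; subband b covers
                 harmonics band_edges[b] to band_edges[b+1] - 1
    """
    result = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        weight = sum(energies[lo:hi])
        if weight > 0.0:
            mean = sum(s * e for s, e in
                       zip(slopes[lo:hi], energies[lo:hi])) / weight
        else:
            mean = 0.0  # assumption: no energy in the subband, zero slope
        result.append(mean)
    return result
```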

The module 148 can also model the phase variation over 
the width of a peak by a more complex curve than a 
linear segment, for example a spline, whose parameters 
are quantized so as to be transmitted to the decoder. 



Another possibility is to perform prior learning of
phase models at the harmonics, representative of the
phase variations over the width of the peaks, which
variations are observed in a corpus of reference
signals. These models are held in a dictionary stored
by the modules 148 and 149. The module 148 of the coder
determines the indices iΔφ corresponding to the
addresses of the models closest to the phase variations
in the neighborhood of the harmonic peaks considered,
and the module 149 of the decoder recovers these models
for the synthesis of the phase of the harmonic
component.
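
By way of non-limiting illustration, the search of the dictionary by the coder and the recovery of the model by the decoder can be sketched in Python as follows. The function names, the representation of a model as a list of phase values sampled over the width of a peak and the squared-error criterion are assumptions made for the example.

```python
def nearest_model_index(dictionary, observed):
    """Coder side: address of the model closest to the observed variation."""
    def distance(model):
        return sum((m - o) ** 2 for m, o in zip(model, observed))
    return min(range(len(dictionary)), key=lambda k: distance(dictionary[k]))


def recover_model(dictionary, index):
    """Decoder side: phase model stored at the transmitted address."""
    return dictionary[index]
```

The same dictionary is assumed to be stored on both sides, so that only the address needs to be transmitted.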

CLAIMS

1. A method of coding an audio signal (x), in which a
fundamental frequency (F0) of the audio signal is
estimated, a spectrum of the audio signal is
determined through a transform in the frequency
domain of a frame of the audio signal, and data
for coding a harmonic component of the audio
signal, comprising data representative of spectral
amplitudes associated with frequencies which are
multiples of the fundamental frequency, are
included in a digital output stream (Φ), in which
the spectral amplitude associated with one of said
frequencies which are multiples of the fundamental
frequency is a local maximum of the modulus of the
spectrum in the neighborhood of said multiple
frequency, and in which said data representative
of spectral amplitudes associated with frequencies
which are multiples of the fundamental frequency
(F0) are obtained by means of cepstral
coefficients (cx_sup) calculated by transforming
in the cepstral domain a compressed upper envelope
(LX_sup) of the spectrum of the audio signal.

2. The method as claimed in claim 1, in which the
compressed upper envelope (LX_sup) is determined
by interpolation of said spectral amplitudes
associated with the frequencies which are
multiples of the fundamental frequency (F0), with
application of a spectral compression function.

3. The method as claimed in claim 2, in which the
interpolation is performed between points whose
abscissa is a frequency which is a multiple of the
fundamental frequency (F0) and whose ordinate is
the spectral amplitude associated with said
multiple frequency, compressed or uncompressed.

4. The method as claimed in any one of the preceding
claims, in which the transformation in the
cepstral domain of the compressed upper envelope
(LX_sup) is performed according to a nonlinear
frequency scale.

5. The method as claimed in any one of the preceding
claims, in which the cepstral coefficients
(cx_sup) are quantized so as to form said data
representative of the spectral amplitudes
associated with the frequencies which are
multiples of the fundamental frequency (F0).

6. The method as claimed in claim 5, in which the
quantization of the cepstral coefficients (cx_sup)
pertains to a prediction residual for each of the
cepstral coefficients.

7. The method as claimed in claim 6, in which the
prediction residual for a cepstral coefficient is
of the form (cx[n,i] - a(i).rcx_q[n-1,i]) / [2 - a(i)],
where cx[n,i] designates a current value of
said cepstral coefficient, rcx_q[n-1,i] designates
a previous value of the quantized prediction
residual, and a(i) designates a prediction
coefficient.

8. The method as claimed in claim 6 or 7, in which
different predictors are employed to determine the
prediction residuals for at least two of the
cepstral coefficients.

9. The method as claimed in any one of claims 5 to 8,
in which the cepstral coefficients (cx_sup) are
distributed into several cepstral subvectors
quantized separately by a vector quantization
pertaining to a prediction residual of the
cepstral coefficients.

10. The method as claimed in any one of claims 5 to 9,
in which the cepstral coefficients (cx_sup) are
normalized before quantization, by modifying the
cepstral coefficient of order 0 in such a way that
the spectral amplitude associated with a frequency
which is a multiple of the fundamental frequency
(F0) is represented exactly by the normalized
cepstral coefficients.

11. The method as claimed in any one of claims 5 to
10, in which the cepstral coefficients (cx_sup)
are transformed by liftering in the cepstral
domain before being quantized.

12. The method as claimed in claim 11, in which the
liftering is of the form cp(i) = [1 + γ2^i - γ1^i].c(i) - μ^i/i,
where c(i) and cp(i) designate
the cepstral coefficient of order i>0 respectively
before and after liftering, γ1 and γ2 are
coefficients lying between 0 and 1 and μ is a
pre-emphasizing coefficient.

13. The method as claimed in claim 12, in which
μ = (γ2 - γ1).c(1).

14. The method as claimed in any one of claims 11 to
13, in which a value of the modulus of the
spectrum of the audio signal at at least one
frequency which is a multiple of the fundamental
frequency (F0) is recalculated on the basis of the
transformed and quantized cepstral coefficients
(cx_sup_q), and said liftering is adapted in such
a way as to minimize a discrepancy in modulus
between the spectrum of the audio signal and at
least one recalculated modulus value.

15. The method as claimed in any one of claims 11 to
13, in which a value of the modulus of the
spectrum of the audio signal at at least one
frequency which is a multiple of the fundamental
frequency (F0) is recalculated on the basis of the
transformed and quantized cepstral coefficients
(cx_sup_q), the cepstral coefficients are
retransformed by liftering and smoothing in the
cepstral domain, minimum phases (φ(k)) of the
audio signal at frequencies which are multiples of
the fundamental frequency are calculated on the
basis of the retransformed cepstral coefficients
(cxl[n]), and the liftering performed before the
quantization is adapted in such a way as to
minimize a deviation between the spectrum of the
audio signal and at least one complex value whose
modulus has a value recalculated for a frequency
which is a multiple of the fundamental frequency
and whose phase is given by the minimum phase
calculated for said multiple frequency.

16. The method as claimed in claim 15, in which the
lifterings performed before and after quantization
are adapted jointly so as to minimize said
discrepancy, and in which parameters (iLif)
representative of the adapted liftering performed
after quantization are included in the data for
coding the harmonic component.

17. The method as claimed in any one of claims 14 to
16, in which the minimized discrepancy for the
adaptation of the liftering relates to at least
one frequency which is a multiple of the
fundamental frequency (F0), selected on the basis
of the magnitude of the modulus of the spectrum in
absolute value.

18. The method as claimed in any one of claims 14 to
16, in which a curve of spectral masking of the
audio signal is estimated by means of a
psychoacoustic model, and the minimized discrepancy for
the adaptation of the liftering relates to at
least one frequency which is a multiple of the
fundamental frequency (F0), selected on the basis
of the magnitude of the modulus of the spectrum in
relation to the masking curve.

19. The method as claimed in claim 1, in which the
spectrum of the audio signal and the cepstral
coefficients (cx_sup) resulting from the
transformation of the compressed upper envelope
are determined for successive frames of N samples
of the audio signal which exhibit mutual overlaps,
and in which said data representative of spectral
amplitudes associated with the frequencies which
are multiples of the estimated fundamental
frequency (F0), obtained by means of the cepstral
coefficients calculated by transforming the
compressed upper envelope, are included in the
digital output stream (Φ) for just one subset of
the frames.

20. The method as claimed in claim 19, in which, for
the frames which do not form part of said subset,
data (icx[n-1/2]) for quantizing an error
(ecx[n-1/2]) of interpolation of the cepstral
coefficients resulting from the transformation of
the compressed upper envelope (LX_sup) are
included in the digital output stream (Φ).

21. The method as claimed in claim 19, in which, for
the frames which do not form part of said subset,
an optimal interpolator filter (128) is determined
for the cepstral coefficients resulting from the
transformation of the compressed upper envelope
(LX_sup) and data (iP) representing said optimal
interpolator filter are included in the digital
output stream (Φ).

22. An audio coder, comprising means for executing a
method according to any one of the preceding
claims.